Forum Discussion
Lerna_Ekmekciog
Nimbostratus
Sep 15, 2006
iControl Server Thread Safe?
I am load testing my custom SOAP service, a Java app that sends SOAP queries to the iControl SOAP server. As part of this test I send concurrent requests to the iControl server, for example n concurrent requests to get a pool and all its fields (LB method, monitor, members) from the device. However, I get only one response from the iControl server with the pool and all its fields correctly populated in the SOAP message. The remaining n-1 responses contain fields with no values.
When I run ssldump on the communication between the server and my app, I can see that the server sees only one request to get the LB method and one request to get the monitor association, rather than n requests, where n is the number of concurrent requests.
I'm trying to figure out where these n requests are being multiplexed.
Is the iControl SOAP server thread safe?
Lerna
20 Replies
- Our iControl server is implemented as a singleton with a single FIFO queue as an accessor. You can make parallel requests to the administration web server but it will buffer requests to only allow one at a time.
If you are passing multiple requests in and are in fact getting all responses back but only one contains data, then I'm a bit at a loss as to how that could be. If the ssldump is showing only one request into the server, are you positive the requests are making it from your code to the server? How are the fields in the SOAP message getting populated with empty values if no request is made to the server?
I've personally tested our servers with apachebench and haven't come across any issues with parallel requests. If you indeed determine by a network trace on the BIG-IP (or the http access logs) that requests are making it in and are not returning data, then you should contact Product Technical Support where they can dig deeper into your server configuration.
I'm very interested to find out how things work out with this issue so keep in touch.
-Joe
- Vladimir_Budilo
Nimbostratus
I'm encountering this issue as well. There are some inconsistencies with the F5 web service. I created a GUI that can invoke the "disable/enable PoolMember" methods asynchronously and simultaneously on multiple pool members (using AJAX). I made the application completely thread-safe, but "bulk" disabling and enabling doesn't work the way it should. I receive absolutely no exceptions, and even though the code executes and a response is received from the F5 web service, the changes are not committed. (Enabling or disabling one pool member at a time always works without any issues.)
Now, when I query the web service for the statuses of multiple pool members in "bulk" (meaning asynchronously and simultaneously), the statuses are returned properly.
I've done multiple tests with this, and the conclusion is the same -- when changes are made in bulk, the code on the f5 web service side isn't handling it properly (no exceptions thrown and no changes are made -- it seems that the "change" requests are simply dropped).
I'm working around this issue by synchronizing the disable/enable methods (which wouldn't be necessary otherwise), but this is a real problem and I'd like to know if there is any documentation on this.
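A minimal sketch of the kind of synchronization described above, written in Python rather than the original application's language. `FakeDevice` is a hypothetical in-memory stand-in for the remote service, not an iControl API; in a real client the body of `set_enabled` would be the SOAP call. Only the "set" path is serialized, since status queries were reported to work in parallel.

```python
import threading


class FakeDevice:
    """Hypothetical in-memory stand-in for the remote web service."""

    def __init__(self, members):
        self.enabled = {member: True for member in members}

    def set_enabled(self, member, state):
        self.enabled[member] = state


class SerializedClient:
    """Lets only one 'set' call be in flight at a time."""

    def __init__(self, device):
        self._device = device
        self._set_lock = threading.Lock()

    def set_enabled(self, member, state):
        # Callers queue up on the lock, so state changes reach the
        # device strictly one after another.
        with self._set_lock:
            self._device.set_enabled(member, state)

    def is_enabled(self, member):
        # Reads are deliberately left unsynchronized.
        return self._device.enabled[member]


def disable_all(members):
    """Disable every member from its own thread; return the final states."""
    device = FakeDevice(members)
    client = SerializedClient(device)
    threads = [
        threading.Thread(target=client.set_enabled, args=(member, False))
        for member in members
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return device.enabled
```

The lock only hides the symptom on the client side; it does not explain why unsynchronized calls would be dropped.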
Thanks,
Vladimir
- For bulk objects, the code in the server implementation essentially repeats the same task for each object. If an error occurs during that process, an OperationFailed exception is returned with the corresponding error code. It's hard for me to guess what could be causing your changes not to take hold. Have you tested the method by running it directly at the BIG-IP and then immediately looking at the GUI to see if it took hold? If you find a case where you make a call and the GUI doesn't reflect those changes, then this is likely a bug that you'll have to submit to support. But I've honestly never heard of this happening, so I'd love to find a way to reproduce it.
Another issue could be in the formatting of how you are configuring the bulk method calls. Can you provide the snippet of code where you generate the method call so that I can verify the inbound parameters are configured properly?
One last option is to turn on iControl tracing on the LTM and look at the inbound parameters that way. This requires a restart of the web server, so I'd rather not go that route if it's on a production system.
-Joe
- Vladimir_Budilo
Nimbostratus
When I say bulk, I simply mean that there are multiple calls to the F5 web service at the same time (even same millisecond). So, let's say, I call disablePoolMember 10 times within the same millisecond (therefore, there are 10 threads running simultaneously on the application server side, where the SOAP calls are made).
When I look at the logs, some of the threads that are sending data at the exact same time are failing (e.g., Thread1 and Thread2 call the "set pool member state" methods at the exact same time; Thread1 is successful, while Thread2 isn't, and no exceptions are thrown).
I'm testing this in a DEV environment for now (obviously), but is there a way to turn up tracing without actually bouncing the web server? Is there a "web service" log file I can look at?
Vladimir
- OK, I get you now. iControl is implemented with FastCGI in a single-connection configuration that essentially implements a first-in-first-out (FIFO) queue. The process accepts requests and processes them one after another in sequential order.
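To make the queueing behavior concrete, here is a small Python sketch of a single worker draining a FIFO queue while many clients submit in parallel. It is only an illustration of the pattern, not the actual iControl or FastCGI code; the names `serve`, `call`, and `run_demo` are made up for the example.

```python
import queue
import threading


def serve(requests):
    """Single worker: handle one request at a time, in arrival order."""
    while True:
        item = requests.get()
        if item is None:  # sentinel: shut the worker down
            break
        name, reply = item
        reply.put(f"handled {name}")  # each caller gets its own reply


def call(requests, name):
    """Client side: enqueue a request and block until its reply arrives."""
    reply = queue.Queue(maxsize=1)
    requests.put((name, reply))
    return reply.get()


def run_demo(n=10):
    """Fire n client threads at one worker and collect every reply."""
    requests = queue.Queue()
    worker = threading.Thread(target=serve, args=(requests,))
    worker.start()

    results = [None] * n

    def client(i):
        results[i] = call(requests, f"req-{i}")

    clients = [threading.Thread(target=client, args=(i,)) for i in range(n)]
    for t in clients:
        t.start()
    for t in clients:
        t.join()

    requests.put(None)
    worker.join()
    return results
```

With this pattern, parallel callers are delayed but none are dropped: every request that enters the queue gets exactly one response.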
In this thread, I discuss how to enable iControl tracing: http://devcentral.f5.com/Default.aspx?tabid=53&forumid=1&postid=16431&view=topic
It will dump the inbound parameters to the /var/log/ltm file including the first 2k of the inbound SOAP request. This log occurs after the request has made it out of the queue and when it hits the web server for processing.
When I first developed the FastCGI implementation, I tested it with hundreds of parallel connections getting and setting values and didn't see any cases where a call returned no error code yet failed to do what was requested. That was a few versions of BIG-IP ago, but as far as I'm aware, the implementation hasn't changed.
I'll see if I can simulate a test that does this with thousands of objects and parallel requests to see if I can reproduce the issue. In the meantime, this will most likely be an issue that you'll have to bring up with product support, because it sounds like you are doing everything right. The only other thing I can think of is that your parameters are not defined correctly, but if it's working in single mode and just not in parallel, that isn't likely to be the problem.
-Joe
- Vladimir_Budilo
Nimbostratus
Joe, thanks for the reply.
I'm working around the issue by synchronizing the enabling/disabling of the pool members (since simply retrieving the pool member statuses works fine). During your next test, try to manipulate the pool members of the same pool (and if you have 1000's of calls, you won't really know if anything went wrong, so I would recommend having 10 simultaneous calls -- they have to be done at the same millisecond).
Again, thanks for your effort!
Vladimir
- Vladimir, I was planning on building an app that will create 10,000 pool members, spin up 100 threads, and have each of them loop over 100 pool members, sending 100 successive disable commands. Then I'll poll all the pool members to look for any that are still in the enabled state. I'll then repeat by enabling them all and checking the state again. I'll repeat this process until I find a discrepancy. If I don't see any issues, I'll look at increasing the numbers and duration. I'm working on a site upgrade today but will try to get to this later this week. I'll update you with what I find.
-Joe
- Vladimir_Budilo
Nimbostratus
Joe, here is what's probably going to happen -- out of the 100 disable requests, the first one will fail, while the second through 100th requests might actually be successful, so you won't see the error.
What I suggest is to check the status of the pool member after each request: if the pool member is still enabled, then an error occurred (and just log a special message to a log file).
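A rough sketch of that check in Python: every thread waits on a barrier so the "set" calls go out as close together as possible, then each thread immediately reads the state back and records a failure if it did not change. `set_state` and `get_state` are hypothetical callables supplied by the caller (in a real test they would wrap the SOAP calls); the dictionary used here is only an in-memory stand-in.

```python
import threading


def concurrent_set_and_verify(members, set_state, get_state, wanted):
    """Set every member to `wanted` at the same moment, then verify each.

    Returns the members whose state did not match after the set call.
    """
    barrier = threading.Barrier(len(members))
    failed = []
    failed_lock = threading.Lock()

    def worker(member):
        barrier.wait()  # release all threads together
        set_state(member, wanted)
        if get_state(member) != wanted:
            # A real test would also write a special message to a log file.
            with failed_lock:
                failed.append(member)

    threads = [threading.Thread(target=worker, args=(m,)) for m in members]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return failed


# In-memory stand-in for the device: member name -> enabled flag.
states = {f"member-{i}": True for i in range(10)}
failed = concurrent_set_and_verify(
    list(states), states.__setitem__, states.__getitem__, False
)
print("members still enabled:", failed)
```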
I'd love to hear your findings.
Thanks!
Vladimir
- Vladimir_Budilo
Nimbostratus
An even better way to test this is to "toggle" the pool member state: have 4 threads send simultaneous "disable" requests (and check the status after each disable request), then send 4 simultaneous "enable" requests (and check the status after each request). Repeating this 100 times is much better than just sending 100 disable requests at the same time.
- Maybe I didn't make it clear. I was going to have 100 threads sending simultaneous "disable" requests. Each thread would send 100 consecutive requests. I figured that way there would be a much higher chance of a collision. I think that's the same thing you are suggesting. Once the 10,000 requests are made, I'll poll all the pool members, and if any of them are not in the state I was expecting, then something went wrong. I didn't want to tie a query (get) into the set logic because it might lessen the chance of requests landing in the same millisecond.
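The two ideas can be combined, roughly as in the Python sketch below: each phase fires simultaneous "set" calls with no query mixed in, and the states are polled only after every call in the phase has returned, alternating between disable and enable. As before, `set_state` and `get_state` are hypothetical callables standing in for the real web service calls, and the thread counts are scaled down from the numbers discussed here.

```python
import threading


def toggle_rounds(members, set_state, get_state, threads_per_member=4, rounds=100):
    """Alternate disable/enable phases and poll after each phase.

    Returns a list of (round, wanted_state, member) discrepancies.
    """
    discrepancies = []
    for round_no in range(rounds):
        for wanted in (False, True):  # disable phase, then enable phase
            barrier = threading.Barrier(len(members) * threads_per_member)

            def worker(member, wanted=wanted, barrier=barrier):
                barrier.wait()  # line all setters up before sending
                set_state(member, wanted)

            threads = [
                threading.Thread(target=worker, args=(member,))
                for member in members
                for _ in range(threads_per_member)
            ]
            for t in threads:
                t.start()
            for t in threads:
                t.join()

            # Poll only after every set call in this phase has returned.
            for member in members:
                if get_state(member) != wanted:
                    discrepancies.append((round_no, wanted, member))
    return discrepancies
```

Because several threads target the same member in each phase, a single dropped request can be masked by its siblings; setting `threads_per_member=1` makes every dropped call visible at the cost of fewer collisions.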
-Joe
