
Steve_87971
Mar 15, 2012

ASM blocks clients that don't accept cookies

 

Hi all

 

I have a HA pair of 3900's running v10.2.3 LTM and ASM, with a blocking policy in place on one of my production VS/applications. We also use OneConnect, the WAN optimised TCP profile and the WAN optimised HTTP profile on this virtual server.

 

From testing it appears that the HTTP class with application security enabled that we've applied to this virtual server is blocking both desktop and mobile browsers from accessing our homepage. Well, I say blocking but the client doesn't get a block page, just a blank page. FYI the class was created by ASM.

 

Using Firefox and Fiddler I can see that the client receives a 504 response (gateway timeout?) and gives up. What is strange is that if the same client hits a URL deeper in the site from a Google link then they can load the page.

 

We have a second VS going to a backup/pre-prod group of servers that run exactly the same application code as the production servers. If I apply all the same iRules, TCP settings etc then cookie-less clients can access the site, it's only when I apply that same HTTP class that it breaks again.

 

 

I'm not an ASM (or LTM!) superstar so can someone help point me in the right direction, please? I'm guessing there's a tick box somewhere...

 

 

Cheers, Steve

 

 

  • Steve,

     

    A question: is the application on the back end setting cookies to send to the browser? If so, ASM will set its own TS cookie to protect the cookie field. I am not sure how it would handle the client not accepting the cookies.

     

     

    A couple of suggestions:

     

    Have you tried just disabling Application Security in the HTTP class? That will essentially turn off ASM functionality, so you should be able to rule ASM in or out as the cause.

     

     

    Another thing you could do when ASM is enabled is in the logging profile under the Web Application, turn on Log All Requests and see if you are seeing the request hit the ASM logs.

     

     

    Lastly, and probably something you will need if you open a case with support: spin up a tcpdump on the front and back end of the ASM and watch the traffic. I assume you will see the request hit the VS, but check whether the traffic is then going out the back end of the ASM towards the servers.

     

     

    Let us know if any of this helps or if you need clarification on anything.

     

     

    Mike
  • Hi Mike, thank you for your input, it's very much appreciated.

     

     

    Cookies are sent to the browser by (what we term as) the front-end application, and ASM. For clarity, our environment is LTM>ASM>IIS-front-end>IIS-back-end>DB.

     

     

    I'm nervous of logging all traffic as this is a high use application. I might try this out-of-hours.

     

     

    It looks like the web scraping settings caused this issue:

     

    I put the whole ASM policy into transparent mode and the application started working again for clients that don't support cookies. After some digging in the ASM logs (straight off the unit, not through the GUI) I could see my testing machine being blocked for web scraping. After setting ASM back into blocking mode but with scraping set to alarm only, again the application works.

     

     

    So that's got to be scraping, right?

     

     

    Scraping is set-up as follows:

     

    Grace interval = 200

     

    Unsafe interval = 1000

     

    Safe interval = 2000

     

    White-list = all-my-subnets

     

     

    What's strange is that I was getting blocked immediately, i.e. not after 200 requests whilst ASM figures out whether or not I'm human. I guess that means ASM knows pretty instantly that I'm not human because I didn't even accept the cookie? ASM also seemed to ignore the white-list of our subnets.

     

     

    It is a problem for us if we immediately block people who don't allow cookies. My understanding of the grace interval was that it'd allow 200 requests as a kind of free pass *before* it blocks. The ASM config guide states:

     

    "The grace interval is how many requests the system reviews while trying to detect whether the client is human. During the grace interval, requests are not blocked or reported. What occurs next depends..."

     

     

     

    So, two questions:

     

     

    Is the grace interval a free pass or just the most requests that ASM will allow until it decides human/robot?

     

     

    and

     

     

    Why did ASM ignore my white-list?

     

     

    Thanks, Steve
  • In reading the notes about the grace interval, it sounds like ASM uses those requests to try to determine whether you are human; if it cannot determine that within 200 requests, you get blocked. However, I think the grace interval is not being applied to you because you are not allowing cookies to be set, which is a requirement for Web Scraping to work properly. The ASM is probably just automatically marking you as bad, because it needs those cookies in order to use the grace period to determine whether you are human.

     

     

    As for why ASM ignored your white-list, I am not sure. If you entered the correct IP and subnet of your client, then I would assume it would not apply web scraping detection to you. I am not sure if the cookie not being set would throw everything off for web scraping, but in the case of simple IP white-listing I would not think so. You might want to open a support case on that particular piece and get them to look at it for you. If you do open a case, please post your findings back to this thread.

     

     

    Mike
  • Hi Mike, you're absolutely correct with the first paragraph there.

     

     

    The documentation available online, though, is a little lacking and somewhat ambiguous. It led my support company, the first-line F5 engineer and myself astray on the expected behaviour of web scraping.

     

     

    Below is the final word from F5 on this which clarifies the position and functioning of web scraping:

     

     

    _____

     

     

    To protect against web scraping, the BIG-IP ASM system attempts to determine if a given HTTP request originates from a human, rather than from a web scraping bot. The mechanisms that BIG-IP ASM uses to determine this are called the Client-Side (CS) challenge, and Client-Side Human User Indicator (CSHUI). One or both may be used depending on the state of the BIG-IP ASM policy settings.

     

     

    When the BIG-IP ASM policy enforcement mode is set to Blocking and the Web Scraping Detection violation is set to Block, the BIG-IP ASM will respond to requests first with the CS challenge injection. If the CS challenge is passed, the CSHUI injection occurs.

     

     

    *NOTE* The second stage of web scraping has to do with a script that identifies human interaction in the browser, which is Client-Side Human User Indicator (CSHUI). This is the script that takes the grace period into consideration. So grace period has nothing to do with the first stage CS challenge. Grace period only applies to the second stage CSHUI when the BIG-IP ASM policy enforcement mode is set to Blocking and the Web Scraping Detection violation is also set to Block.
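
As an illustration (not part of F5's reply), the two-stage ordering described above can be sketched in Python. The names `ClientState` and `scraping_verdict` are hypothetical, and the "block once the grace interval runs out without a human verdict" rule is an assumption based on the reading given earlier in this thread, not confirmed ASM logic.

```python
from dataclasses import dataclass


@dataclass
class ClientState:
    """Hypothetical per-client view; not an ASM data structure."""
    passed_cs_challenge: bool  # stage 1: Client-Side (CS) challenge result
    judged_human: bool         # stage 2: CSHUI verdict so far
    requests_since_cs: int     # requests seen after stage 1 was passed


def scraping_verdict(client: ClientState, grace_interval: int = 200) -> str:
    """Return 'drop', 'allow' or 'block' for the current request."""
    # Stage 1 comes first and has no grace interval at all.
    if not client.passed_cs_challenge:
        return "drop"
    # Stage 2: a client already judged human is let through.
    if client.judged_human:
        return "allow"
    # Still undecided: the grace interval only applies here.
    if client.requests_since_cs <= grace_interval:
        return "allow"
    # Assumption: no human verdict within the grace interval means block.
    return "block"
```

Under these assumptions a cookie-less client never reaches the stage where the grace interval of 200 would matter, which matches the immediate failure seen in testing.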

     

     

    The BIG-IP ASM Client-Side (CS) challenge is an initial step used in the denial-of-service (DoS) prevention, brute force protection, and web scraping prevention components of the BIG-IP ASM anomaly detection feature. The primary function is to ensure that an HTTP request originates from a valid or JavaScript Proper client, and not a bot. A client is considered JavaScript Proper if it meets the following three criteria:

     

     

    The client must support JavaScript

     

    The client must support HTTP cookies

     

    The client must be able to calculate the result of a computational challenge sent by BIG-IP ASM

     

     

    Any client that does not satisfy these criteria is considered a bot, and will not be treated as legitimate traffic.
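
As an illustration (not part of F5's reply), the three criteria amount to a simple all-or-nothing check. `ClientCapabilities` and `is_javascript_proper` are hypothetical names used only for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClientCapabilities:
    """Hypothetical summary of what a client can do."""
    supports_javascript: bool
    supports_cookies: bool
    solves_challenge: bool


def is_javascript_proper(client: ClientCapabilities) -> bool:
    """A client qualifies only if it meets all three criteria."""
    return (
        client.supports_javascript
        and client.supports_cookies
        and client.solves_challenge
    )


# A normal browser with cookies switched off fails the check,
# even though it runs JavaScript and could solve the challenge.
cookieless_browser = ClientCapabilities(True, False, True)
```

This is why a real person browsing with cookies disabled is treated the same as a bot.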

     

     

    How CS challenge works

     

     

    When an HTTP request is received, the BIG-IP ASM replies with an HTTP response that includes a block of JavaScript that the client must execute in order to complete the computational requirement of the CS challenge. When the client executes the JavaScript, it will automatically return an HTTP POST request to the BIG-IP ASM containing a new cookie with a name of the form TS_75. The value of this cookie will be the result of the computational challenge, which will be verified by the BIG-IP ASM. If the CS challenge is passed, processing continues on to the next appropriate stage for a given request.
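
As an illustration (not part of F5's reply), the round trip can be modelled roughly as below. The actual computation ASM asks the client to perform is not described in this thread, so the SHA-256 step is purely a stand-in, and `COOKIE_NAME`, `issue_challenge`, `solve_challenge` and `verify_challenge` are hypothetical names.

```python
import hashlib
import hmac
import secrets
from typing import Dict

COOKIE_NAME = "TS_example"  # hypothetical; real ASM cookie names differ


def issue_challenge() -> str:
    """Server side: create a random value to embed in the injected script."""
    return secrets.token_hex(16)


def solve_challenge(nonce: str) -> str:
    """Stand-in for the work the client-side JavaScript would do."""
    return hashlib.sha256(nonce.encode("utf-8")).hexdigest()


def verify_challenge(nonce: str, cookies: Dict[str, str]) -> bool:
    """Server side: check the answer carried back in the cookie."""
    answer = cookies.get(COOKIE_NAME)
    if answer is None:
        # A client that refuses cookies can never present an answer.
        return False
    return hmac.compare_digest(answer, solve_challenge(nonce))
```

The point of the sketch is the failure path: without cookie support there is nothing for the server to verify, so the challenge cannot be passed.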

     

     

    When working with the anomaly detection features that use the CS challenge, it is important to note that the CS challenge injection does not occur immediately with the first response to a given HTTP request. By default, the BIG-IP ASM will not respond with the CS challenge injection until it has received 10 requests per URL within 5 minutes.

     

     

    During the first CS challenge stage, when web scraping is enabled and in blocking, this JavaScript qualifies requests and then intercepts any request to these URLs regardless of the source IP. So once 10 requests have been made to the homepage URL, anyone requesting the homepage URL will need to pass the script before they can continue. If the client is unable to run the challenge, ASM drops the request, and subsequent requests are served with the script and dropped until the challenge is answered.
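
As an illustration (not part of F5's reply), the "10 requests per URL within 5 minutes" trigger behaves like a per-URL sliding-window counter that ignores the source IP. `ChallengeTrigger` is a hypothetical name, and whether ASM starts challenging on the 10th request or the one after it is an assumption.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class ChallengeTrigger:
    """Hypothetical per-URL sliding-window counter; not ASM code."""

    def __init__(self, threshold: int = 10, window_seconds: float = 300.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        # Keyed by URL only, so requests from every client count together.
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def record(self, url: str, now: float) -> bool:
        """Record a request; return True if the URL is now being challenged."""
        hits = self._hits[url]
        hits.append(now)
        # Forget requests that fell out of the window.
        while hits and now - hits[0] > self.window_seconds:
            hits.popleft()
        return len(hits) >= self.threshold
```

On a busy site the homepage crosses this threshold almost immediately, which would explain why it fails for cookie-less clients while a rarely requested deep link from Google still loads.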

     

  • Steve,

     

    Thanks for posting the information F5 gave you; this is really good information on how Web Scraping works. I know I learned a lot here, and I am going to save it in my support docs for future reference.