Citrix XenApp 7.x VDA Registration State stuck in Initializing and a self healing powershell script to fix it

October 16th, 2014 No comments

On XenApp 7.5 servers (and possibly 7.1 and 7.0), you may notice the registration state of the machine is stuck on “Initializing” in Citrix Studio and no one will be able to launch any apps. I’m still investigating if the 7.6 VDA also has this behavior. This is how it looks in Studio:


You can also run the following powershell command:

Add-PSSnapin Citrix.*.Admin.V*
Get-BrokerMachine | select MachineName, SummaryState, MachineInternalState, 


You will notice the impacted server in question has a MachineInternalState set to “Unavailable” and the RegistrationState is stuck on “Initializing”. We’ve noticed this happening on XenApp servers that have been up around the 24 or 25 day mark. They suddenly stop being registered.

The work around is to restart the “Citrix Desktop Service” on the impacted server. The service is not in a Stopped state so it’s hard to setup monitoring using a 3rd party monitoring tool. I just wrote this PowerShell script that queries the delivery controller for the actual state of registration on every XenApp server to see if it’s stuck Initializing, restart the Citrix Desktop Service on the impacted servers, log a .csv file with the names of the impacted servers for historical purposes, and send an email notification out to the Citrix admins and NOC. I have this running as a scheduled task (stacked a few min apart) every 5 minutes on all my delivery controllers:

##Written by Jason Samuel -

Add-PSSnapin Citrix.*.Admin.V*

$Results = @()
$Date = (Get-Date -DisplayHint Date)
$save_date = $Date.ToString("MM-dd-yyyy-hh-mm-ss-tt")

$Results += Get-BrokerMachine -RegistrationState "Initializing" | select DNSname, AgentVersion

If (!$Results) 

Else {
##Restart the Citrix Desktop Service on stuck servers
foreach ($vda in $Results)

Restart-Service -InputObject $(get-service -ComputerName $vda.DNSName -Name "Citrix Desktop Service") -Verbose
$EmailBodyList += $vda.DNSName + "`t" + $vda.AgentVersion + "`r`n"

##Writes the result to a CSV file on your delivery controller for historical purposes
$file_output = ('D:\Citrix_VDA_Restart\XA_VDA_' + $save_date + '.csv')
$Results | Export-CSV -Path $file_output -NoTypeInformation

##Sends an email with the list of servers in the body as well as the CSV attachment
$filename = $file_output
$smtpServer = “YOURSMTPSERVER”
$msg = new-object Net.Mail.MailMessage
$att = new-object Net.Mail.Attachment($filename)
$smtp = new-object Net.Mail.SmtpClient($smtpServer)
$msg.From = “”
$msg.Subject = “XenApp 7.6 servers stuck Initializing”
$msg.Body = “The below XenApp 7.6 servers are stuck Initializing on the Delivery Controllers. This script is now attempting to 

restart the Citrix Desktop Service on the impacted servers and force the VDA to register.” + "`r`n" + "`r`n" + $EmailBodyList

Start-Sleep -s 5


WordPress will run the code off the page so you can highlight all of the above and paste to Notepad or download it here in text format (make sure you right click – save as). Just change the extension to .ps1 and set your path for the CSV files, SMTP server, and email addresses you want notifications to go to:


Anyhow, you will get a nice notification email like this when servers get stuck. In this example I had 2 servers that were stuck Initializing and by the time the email was received the VDA had already been restarted on both and were Registered again. :)

I’m hoping this is fixed in the the 7.6 VDA and am in the process of testing. I really don’t like to have to setup scripts to monitor service health like this but in a pinch, running a self-healing monitoring script is your best bet to prevent an application outage until the issue is resolved. I’m currently monitoring all servers with the 7.6 VDA and will update this post if I see them exhibit the same behavior as the 7.5 VDA.

Just updated the script above to include the agent version. Here’s how your notification email and .csv file will look:



Citrix console failed to remove the server from the farm with error code 80000007

October 1st, 2014 No comments

Had to clean up a Presentation Server 4.0 farm today. Yes you heard that right, PS 4.0. Yuck. Lots of servers that were decommed but never removed from the farm. I needed to eliminate the junk and see what was really going on with this farm and get the apps migrated to XenApp 7.6. So basically I had a bunch of dead servers that needed to be removed from the farm by force before I could proceed with my analysis of the good apps that were left. If you go through the console and try and remove the each server one by one, you would get this message:

The Presentation Server Console failed to remove the server. Error Code: 80000007


Do the following via command line:


you will see it’s a member of the farm.

So let’s hit the data store and kill it. Do a:

dscheck /full servers /clean

servers will remain most likely since you verified with the qfarm it’s not really an inconsistent record and technically valid. The only option is to forcefully delete it:

dscheck /full servers /deletemf SERVERNAME

Keep in mind the SERVERNAME is case sensitive so when you do a qfarm and the server name is all capital letters, you must type it in as all capital letters. Now do the clean command again:

dscheck /full servers /clean

Now when you reopen the console and do a discovery, the server will be gone for good.

XenDesktop VMs on PVS getting Netlogon 5719 and Group Policy 1129 errors

August 11th, 2014 No comments

After reverse imaging and doing a XenServer Tools update, I had some issues talking to the DC and getting group policies to apply consistenly at boot. When XenServer Tools is updated, it may remove and re-add the NICs. This changes the provider order sometimes.

Symptoms were there were several Netlogon 5719 errors:


This computer was not able to set up a secure session with a domain controller in domain XXXXXXXX due to the following:
There are currently no logon servers available to service the logon request.
This may lead to authentication problems. Make sure that this computer is connected to the network. If the problem persists, please contact your domain administrator.
If this computer is a domain controller for the specified domain, it sets up the secure session to the primary domain controller emulator in the specified domain. Otherwise, this computer sets up the secure session to any domain controller in the specified domain.

and Group Policy 1129 errors in the event log:

The processing of Group Policy failed because of lack of network connectivity to a domain controller. This may be a transient condition. A success message would be generated once the machine gets connected to the domain controller and Group Policy has succesfully processed. If you do not see a success message for several hours, then contact your administrator.


To resolve this I did the following:

1. Verify the streaming NIC is at the top. In my case, NIC 3 is my streaming NIC and NIC 4 is my local network traffic NIC. So it should look like this:


2. Adjusted the provider order so “Microsoft Windows Network” is on top and Symantec or whatever antivirus you are using is at the very bottom:



3. Created the following registry subkey:


Value Name: ExpectedDialupDelay
Data Type: REG_DWORD

Data Range is between 0 and 600 seconds (10 minutes) and the default is 0. I set it to 600 seconds. The thought is Netlogon is starting before the NIC is fully up. This is a timing issue evidenced by group policy errors like the following where the number of ms it set to 0, meaning it didn’t even try to talk to the DC:


Setting the delay to 10 minutes gives the VM some buffer room to get the network up.

How to use PsExec and Xcopy to pull data off a large number of remote machines simultaneously

July 30th, 2014 No comments

My co-worker and I were in a bit of a pinch recently and had to quickly pull data off a large number of VMs as fast as possible. We had a real time crunch to get it done.


Well, Ctrl+C Ctrl+V is one way to do it but you’re better than that. The easier way is to use PsExec and Xcopy/Robocopy to do it. Getting PsExec and Xcopy to play nicely is sometimes a bit tricky. Here’s a really quick and dirty script to get it done. Not the most efficient but it will work in a pinch

1. Copy psexec into a folder on the server you plan to copy your data to. Let’s call this drive D:\DataBackup. Now share it out to Everyone with Read/Write access.

2. Create a .bat file with the following. Let’s call it remotescript.bat and let’s say you are after user profile data. So:

xcopy "\\%computername%\c$\Users" 
"\\yourservername\DataBackup\%computername%\" /e /h /c /y 

Let me explain what it does. Xcopy is invoked to look at the local VM’s name under the c:\Users folder. It then excludes and directories you specify in an excludes.txt file located on the file share you created. Then it copies whatever is left to the file share you created under a directory with the VM’s name. The switches are doing the following

/e – Copies all subdirectories, even empty ones
/h – Copies files with hidden and system file attributes
/c – Ignores errors
/y – Suppresses overwrite file confirmation prompts

and when it’s all done exits.

3. Create the excludes.txt file in this same folder. This will contain all the directories you don’t want to copy over. Here’s an example of mine:

\All Users

You can even do a wildcard where if it begins with something it won’t copy it. For example, if I have several test accounts that all start with “Test_userID” then I would say:


4. In your file share, create a file called VMnames.txt and have 1 name per line. Pretty simple. Export it from wherever you want and massage the data in Excel if you need to. Text to Columns works wonders.

5. Now open up a cmd prompt and go to the folder you have everything staged so far. Run the following where the user ID has admin rights to the VMs:

PsExec -u domain\userID -p xxxxxxxxxx @VMnames.txt -d -c -f remotescript.bat

Now let me explain this. PsExec is invoked and will pass your user ID and password to the remote machines specified in VMnames.txt and run remotescript.bat. Here is what the switches are doing:

/d – Don’t wait for the script to finish running on each VM. Basically you are telling the script to run on all VMs in parallel. Otherwise you’ll be sitting around all day as each VM finishes copying.

/c – Copy the remotescript.bat to the remote machine

/f – Force the copy in the event remotescript.bat was already copied to the machine. Comes in handy if you did some testing on a few VMs first before letting the script loose on all the VMs.

Hope this helps. You can also use Robocopy which is what I prefer over Xcopy. Just modify the above accordingly. Here’s a standalone Robocopy script I like to use to copy all files, empty folders, and ACLs while still retaining time stamps. Comes in handy all the time:

robocopy "c:\SourceFolder" "\\ServerName\c$\DestinationFolder" 
/E /ZB /DCOPY:T /COPYALL /R:1 /W:1 /V 
/TEE /LOG:Robocopy.log

Here is what the switches mean:

/E – Copy subfolders including empty folders
/ZB – Use restartable mode
/DCOPY:T – Copy file directory timestamps
/COPYALL – Copy all file info (data, attributes, time stamps, ACL, owner info, and auditing info)
/R:1 – Retry failed copies once
/W:1 – Wait 1 second between retries
/V – Verbose logging
/TEE – Writes the status to the console window and to the log file
/LOG: – Specifies log file and will overwrite if there is already one named the same

Mitigating DDoS and brute force attacks against a Citrix Netscaler Access Gateway

July 2nd, 2014 1 comment

DDoS and brute force attacks are constantly evolving and becoming more widespread. It’s not just high profile websites like financial institutions and e-commerce sites that get attacked anymore. Any organization can be a target in a matter of minutes. A few sample scenarios could be there’s a press release about a new product your company is releasing so they use a DDoS smoke screen to gain entry into your network to cover data theft. Or your whole industry is getting the attention of the media and you’re one of the top companies in that industry so whichever groups takes you down will be in the spotlight. Or your CEO does something notable and is in the news and so anything associated with him will be in the news including an attack against his company. Or unfavorable legislation passes for your industry and public opinion is low, ripe pickings for hacktivists. Or the attacker just plain felt like attacking an entire group of sites to test out a botnet and yours just happened to be one of them. The possibilities are endless so don’t think you will never be targeted because nothing has happened in the past. Check out for a live view of global DDoS traffic. You should be able to confidently say you have a mitigation plan to keep one of these attacks at bay.

There are several things you can do to protect your Citrix Netscaler Gateway (Access Gateway) from DDoS/DoS and brute force attacks. First off, DDoS protection should be in front of the Netscaler in my opinion. In front of the firewall even. Look into DDoS protection from your ISP if they offer it or an onsite solution that sits in front of the firewall like appliances from Corero or Arbor Networks. You can even protect the ISP router if you like with some of these solutions. There are also DDoS mitigation/scrubbing services like Prolexic (now part of Akamai) or CloudFlare you can look into.

It’s not just traditional SYN flood attacks at the network level you have to worry about anymore. There are reflection\amplification attacks utilizing DNS, NTP, etc. There are fragmented packet attacks causing your appliances to wait around needlessly for missing packets, there are application layer attacks against your web server or app itself, and there are the worst kind which is specialty crafted packet attacks against the appliance or back end server OS itself. There’s always something new around the corner which is why you really need a dedicated on-site appliance or external service to help analyze and mitigate any and all attacks including 0 day attacks. You want to be pro-actively alerted to and remain vigilant against attacks, not scrambling during and after the attack.

Even if you implement all these layers in front of your Netscaler, there are a few things you should configure on the Netscaler itself for an extra layer of security. Netscalers are very powerful devices can can help defend against regular DoS and to some degree DDoS attacks very well. You’ll need a Platinum level device for the full set of security features such as DoS shield, Day Zero attack prevention/Application Firewall, etc. You have the capability to do the following:

1. Application layer defense (Layer 7 defense) – like Slow Read, Slow HTTP POST (RUDY), Slow HTTP Headers (slowloris), HTTP Flood, Random Searches, Apache Range Header, etc. that cause resource starvation on the web servers (not usually a volumetric attack)

  • Application protocol validation (enforcing RFCs thus dropping malformed packets and the ability to create your own validation using Content Filtering or Responder policies)
  • Surge protection/priority queuing (presents traffic to backend servers based on their current capacity while rest gets queued until resources free up. priority queuing can weight certain kinds of requests providing them preference in processing)
  • HTTP flood protection (HTTP GETs are challenged)
  • HTTP low-bandwidth attack protection (monitoring for excess slow connections and proactively reaping)

2. Transport layer defense (Layer 4 defense) – like SYN floods, DNS query floods, SMURF, SSL floods, etc. to starve bandwidth on the web servers by using up sockets (volumetric attack)

  • Full-proxy architecture (nothing in without an explicit policy you set)
  • High performance design (low latency processing, purpose built HTTP parsing engine, hardware accelerators)
  • Intelligent memory handling (memory isn’t allocated until connection is validated)
  • Extensive DNS protection (running either DNS proxy or authoritative modes protects you with DNS specific validation, rate limiting, etc).

3. Network layer defense (Layer 3 defense) – DNS amplification, IMCP floods, UDP floods, teardrop, fraggle, christmas tree, etc. to starve bandwidth on network devices (volumetric attack)

  • Embedded defenses (SYN cookies)
  • Default deny security model (dropping invalid packets that are not defined in your policies)
  • Protocol validation, rate limiting (can be done using AppExpert/Responder policies)

Once you have your stack of DoS/DDoS protection in place, test it. During a maintenance window, attack your own network (please let your Network/Security/IT Management teams, etc. know what you’re doing ahead of time). See how your defenses behave. There’s no point in putting up a giant expensive wall against a flood if it has cracks in it. See what your limits are. Download LOIC, Slowloris, Hping3, HOIC, Kali, etc. and let them loose. If you don’t want to do it yourself, there are tons of vulnerability and penetration testing firms out there that can help you. Make sure there are members of your team that are getting IT-ISAC, FS-ISAC, US-CERT, etc. notifications and actively engaging peers in your business sector to see what they have come across. Ensure your organization has an incident response plan specifically for DDoS attacks.

Now let’s get into the technical stuff. So how do you enable defenses on the Citrix Netscaler? Specifically around the Netscaler Gateway (Access Gateway)?

1. Configure Max Login Attempts & Failed Login Timeout – This can be done globally on the Netscaler Gateway (Access Gateway) but I prefer to do it on the Netscaler Gateway vServer itself. Go to Netscaler Gateway > Virtual Servers > double click your virtual server, and select the number of attempts and timeout. You should set the number of attempts to just 1 attempt below what your AD account lockout policy is set to so that user accounts are protected from being locked out in AD, let the Netscaler do the lockout for you. In this example I have 5 attempts and a 15 minute timeout set in a scenario where my AD account lockout policy is set to 6 failed login attempts:


If a user calls the help desk frantically saying they need to login and the help desk sees their account is not locked out but resets their AD password, they will still need to wait 15 min before attempting to login again on the Netscaler Gateway because it is still locked out there. There was no way for you to be able to “clear” their user ID for them from the Netscaler GUI. I’ve been asking for this feature for sometime and with the 10.5 firmware release yesterday, this is may now be possible (I haven’t had a chance to test it yet though):

Also note, this is all keyed off the user ID, not client IP. If you type “jsmith” for the user ID and try 5 different incorrect passwords, you will get the max login attempts message. But if you try “jsmith”, “jdoe”, “jlee”, etc. it won’t trigger the max login attempts. Also if they just type a user ID and no password, it won’t trigger a login attempt either. I would like to see this changed where any click of the “Log On” button is considered a logon attempt instead and have submitted a feature request for this. My concern is that if an attacker has a list of user IDs (easily guessed or extrapolated), then can sit there and nail the Netscaler Gateway attempting to brute force the password causing an Active Directory account lock out. The help desk gets a call from the user, they unlock the account, and immediately it gets locked again because of the attacker. How’s the user supposed to work if their account keeps getting locked out? As a work around, you can create a responder policy that looks for the POST from the Logon button being clicked to mitigate this threat. Look for a POST going to to help build your policy.

2. Rate limiting – Prevents an attacker from brute forcing the login page. If they hammer the page too many times, the Netscaler will drop the packet. In this example I’m going to configure a Responder policy to drop packets if they exceed more that 4 page requests in 10 seconds.

-Go to AppExpert > Rate Limiting > Selectors and create a new selector with these two expressions:



so your Selector should look like this with them stacked on top of each other:

-Now create a Limit Identifier that references the Selector. For the sake of testing, I have set it to 4 requests within a 10 second period. Please don’t use this in production because the time slice is way too high. This is just so we can test easily:


-Now create a Responder Action if you intend to redirect the user. In this example I want to drop traffic so you don’t need a Responder Action for this. It can be done on the Responder Policy itself. So go ahead and create a Responder Policy like this with an Action of DROP and the Expression to check for the Limit Selector you created before:


-If you do syslogging (I hope you are – see, you can create a log action here so you can be notified in the event of a brute force or DoS attack. Just create a new Log Action like this on your Responder Policy. I like to log it to the newnslog as well. You will need to bypass the safety check to get it to apply or it will warn you it’s unsafe:


You’ll also need to go to System > Auditing > Settings > Change global auditing settings > check the “User Configurable Log Messages” box for both SYSLOG and NSLOG auditing types. Otherwise you’ll see hits on your Responder policy but no hits on your Auditing Message Action policy:


Once done, refresh your Netscaler Gateway login page time 5 times. On the 5th time it should start dropping. Use the HTTPFox FireFox add-on to watch it if you like. You should also be able to go to your Responder Policy and watch the hit count rise. Don’t forget to adjust your threshold and time slice to something more realistic after your testing is done.

You can view your Responder policy hits from the CLI by using:

nsconmsg -K /var/nslog/newnslog -d current | egrep -i responder

and Message Action hits by using:

nsconmsg -K /var/nslog/newnslog -d logfromnfw


show audit messageaction

and to view the actual message:

show audit messages -loglevel ALERT

If you chose to not write to the newnslog and are instead writing to the ns.log, you can use:

tail -f /var/log/ns.log

Just don’t forget about the 7 second delay before the hits appear in your Putty window.

One thing to note, remember the AOL days in the 90’s? Every user was behind the same proxied IP address. Blocking an IP would block pretty much all AOL users. This changed in 2006 but even Wikipedia had trouble blocking malicious AOL users before then: If you do business with another large organization and they are coming through a proxied IP, they could potentially trigger the rate limiting if the time slice is too large and several users tried to login at once. You would unintentionally block legitimate traffic. So keep an eye on this. The syslog email alerts will help you out big time to detect this and make adjustments as necessary before your service is impacted.

3. TCP optimizations – I’ve Tweeted about this before. You should be using the “nstcp_default_XA_XD_profile” on your Netscaler Gateway (Access Gateway) virtual server. This has optimizations in it designed around XenApp and XenDesktop. ICA traffic is a lot of tiny packets so a lot of back and forth communication. This TCP profile speeds up ICA transmission by cramming more packets into each transmission. There are 3 very good blog articles from Citrix by Abhilash Verma covering tuning the TCP stack using profiles:

Now the “Nstcp_default_tcp_lnp” profile is recommended in low bandwidth scenarios so I’m not sure if during a DDoS attack it is best to switch the profile on your vserver temporarily to help keep your ICA traffic flowing. This is something I need to test.



Compare the screenshot between the default TCP profile vs the XA/XD specific one. To break it down:

  • TCP window scaling – increases the receive window size by a factor of 4. This is basically telling the client that the Netscaler Gateway can take on more packets than a typical standalone web server could so “bring it on”. For most scenarios where the client has plenty of bandwidth, this allows for a much better ICA session because it is sending more packets per transmission.
  • MSS – Maximum segment size is exactly what it sounds like, the largest amount of data you can send in one TCP segment. It goes from the default 1460 bytes to 0 bytes. The bigger the segment, the more fragmentation can occur as it travels over all the Internet routers/hops which causes the connection to slow down (latency). The lower the number, the less fragmentation but slightly more overhead on the Netscaler. By dropping the MSS to 0 bytes, I BELIEVE the Netscaler is telling the client it can go as small as the client can can go to prevent fragmentation. But on other network devices I’ve seen an MSS of 0 mean disable so I’m not 100% sure about the setting. I’m going to trust the Netscaler product team’s judgement on this one. :)
  • Max packets per MSS – Bumped from 0 to 10.
  • Max packets per Retransmission – Bumped from 1 to 3. This is saying send 3 packets instead of 1 per retransmission.
  • Minimum RTO – Dropped from 1000 milliseconds to 200 milliseconds. This speeds up retransmission rate drastically.
  • Selective Acknowledgement – This is now checked. SACK allows the receiver to acknowledge out of order blocks of packets which were received correctly and the last packet that was received in the right order. This prevents the sender from having to retransmit data he thought was lost at a slower rate. The sender gets the SACK and just continues transmitting away at high speed. It only has to retransmit packets that were truly lost.
  • Nagle’s algorithm – This is now checked. This reduces the number of packets required to be sent. ICA is a very chatty protocol so tons of small packets. The algorithm buffers small packets and sends them as a group instead of one at a time to increase transmission efficiency.

Now compare that against the lnp profile:


4. Alerting – Use AAA syslogging to alert you of suspicious login activity. Splunk, Kiwi, Command Center, etc. Set an alert that if you see X number of invalid logins within a couple of minutes period, send an email. See

5. External monitoring – Always setup external monitoring of your URLs from multiple data centers around the world. Pingdom, Site24x7, Dotcom-Monitor, etc. are all good services for this and even offer some degree of free monitoring. I prefer extended validation when setting up monitors, not just looking at HTTP response codes or headers. Look for something near the footer of the web page like a copyright line. Until that page is fully loaded, I wouldn’t consider it a successful poll. Make sure to set response time alerts. If a page that normally takes 1 second to fully load is all of a sudden taking 10 seconds, you have a problem and need to know about it.

6. Don’t rely completely on IPS/IDS – Intrusion Prevention Systems and Intrusion Detection Systems are a good layer of security but that’s it, it’s just a layer. Typical HTTP floods will look like legitimate traffic to the IPS. And it will sit there and track each and every one of those connections quickly exhausting resources. An IPS is not designed with DDoS mitigation in mind where there could be tons of legitimate looking traffic thrown at it. It’s designed to inspect individual traffic streams and analyze behavior.

7. Obfuscation – I wrote about using your Netscaler to perform server header obfuscation several years ago to help prevent HTTP fingerprinting: It’s an easy rewrite policy to configure and adds just one more layer of security. Now anyone that works on Netscalers on a daily basis can spot a Netscaler Gateway page a mile away no matter what skin or URL rewrites are being done so don’t rely on this extensively. It’s just an extra layer of security to slow down an attacker and I usually use this on my load balanced vservers more effectively which are usually custom websites than on Netscaler Gateway (Access Gateway) vservers which is pretty common. The article I reference in that post from Citrix is no longer available but here’s a quick example of how I do it:

-Create an Rewrite Action with the Rewrite Action of REPLACE
-Set the expression to:


-Set the replacement string to:


(in my example I set it to “jasonsamuel” for you easily see the change)

-Now create a Rewrite Policy that binds to this action. Undefined Action is:


and Expression should be:


then go ahead and bind this to your Netscaler Gateway vserver. Make sure to choose Rewrite (Response) and not Rewrite (Request) or it won’t bind.

You can also set this globally via GUI. Just go to System > Settings > Configure HTTP Parameters > Server Header Insertion. Check it to enable it and type in whatever you like in the box to set the server header. I never do it this way and prefer to do it via the rewrite policy and bind globally.

Anyhow, use the HTTPFox or Live HTTP Headers FireFox addons to quickly verify the change:




8. Verify management IPs – The Netscaler by default is set to only allow administration on the NSIP for management tasks. But if you walk into a new environment where someone else stood up the Netscaler, they could have enabled it on a SNIP, MIP, or VIP for whatever reason and this is a huge security issue if an attacker uncovered it. Always double check.

9. Verify all local accounts – Audit them. Document them. Change the default nsroot password to a 25 character plus passphrase. Use integrated AD authentication if possible and set granular policies for any admins that don’t need full access to all modules on the Netscaler.

10. Do not enable verbose login error messages talks about enabling Enhanced Authentication Feedback. Don’t do it. Never give an attacker any information to work with. If you’re troubleshooting something, use AAA logging instead.

11. Kill all weak SSL ciphers – I talked about this a few years back: Verify your are allowing only the strong SSL/TLS protocols and weak SSL ciphers are disabled. You only want to enable what you need to pass whatever compliance your organization may have to meet (SOX, PCI-DSS, HIPAA, FERPA, etc.)

12. 2 factor authentication – RSA SecurID, Symantec VIP, etc. are all supported. Tons of SMS based services too if you don’t want to use soft tokens (smartphone apps) or hard tokens (keychain fobs). Remember, the Netscaler Gateway can do primary and secondary authentication. Set the one time password (2 factor auth token) to be the first and set the AD authentication policies to be the second. That way valid AD accounts are protected from lockouts by an attacker. There is no excuse to not use 2 factor authentication in your organization. It is dirt cheap to implement and mitigates so much risk you’d be stupid not to. Use it.

13. Setup Command Center and Insight Center – They’re both free so use them. Depending on your license level you’ll get a certain feature set. If you’ve got a Netscaler, there is no excuse. You need to be running them. The automated config backups of Command Center alone can save you in the event of a catastrophic attack. Insight Center can help uncover unusual behavior and traffic spikes:

Citrix Command Center:

Citrix Insight Center:

14. Managed IPS/IDS & Firewall log review – There are so many managed services out there to review your firewall and IPS/IDS logs in realtime. There are companies with NOCs dedicated to respond to issues 24/7 so you don’t have to. Look into something like this from Alert Logic, Trustwave, SecureWorks, 403 Labs, etc. if you don’t have the resources available on staff. I’ve been in a couple of situations in my career where a phone call from one of these third party solutions uncovered something internal monitoring missed and we were able to respond to the issue before it became a bigger problem. As I said above, make sure you enable syslogging and see if one of these companies can watch for Netscaler related activity for you. Even a spike in CPU you could easily see using SNMP monitoring could indicate a problem and having someone dedicated to monitoring will come in handy.

I hope this article helps you out. Please do leave a comment below if you have any DDoS or brute force attack mitigation techniques you’d like to share.