Let’s do one about Citrix Provisioning, Windows Performance Counters, Telegraf and InfluxDB Cloud!
In previous blogs I have shown the power of visualizing data. Of course there are awesome tools out there, but sometimes there is an issue in a customer environment and you just want to start measuring a.s.a.p.
At my current employer we do a lot of Citrix Provisioning. Normally this performs really well, but when there are issues it's kind of hard to get the full picture, especially when the performance varies and there are a lot of target devices in play. What I did before was create PowerShell scripts to gather the data and keep them looping for a couple of days while uploading the data to InfluxDB (so basically a data pull and then an upload).
But then I remembered that Citrix added performance counters in Provisioning 1909. Hmmm, then I could use a Telegraf agent to collect this data for me and natively upload it in the correct format to InfluxDB (so a data push, way more efficient).
You can install InfluxDB locally, but in this blog I’m going to use InfluxDB Cloud. For my lab environment the Free tier is sufficient. Of course it is possible to connect Grafana (Cloud or on-prem) to InfluxDB Cloud, but I will not cover that in this blog.
So at a high level it will look like this:
Let’s start rolling!
InfluxDB
First go to InfluxDB Cloud (influxdata.com) and sign up.
Once signed up, choose where you would like to store the data. I chose Microsoft Azure/Amsterdam.
Choose the plan. I selected the Free plan.
Within a couple of seconds you can start building.
First we need a “bucket” (i.e. a database).
Give the bucket a name and click Create.
Telegraf Agent
The next thing to do is create a configuration for the Telegraf agent.
Select the bucket you created earlier.
Select the type of input you want to configure the agent with (many, many are available). For now we need the Windows Performance Counters.
Name the configuration.
The default performance counters (CPU, RAM, disk performance...) I left as they are. They could be handy, but of course you can remove them to save on data writes.
Let’s add the performance counters:
To add additional performance counters to the config we first need to know which ones are available.
To show the available Windows performance counters, open a console (on the Provisioning Server in this example) and enter “typeperf -q”.
On a Citrix PVS server these are the ones I’m interested in.
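The full typeperf list is long, so it helps to narrow it down. Below is a minimal sketch of a filter, assuming the object names you care about start with "Citrix PVS" (as the StreamProcess object used in this blog does). The function name Select-PvsCounter is made up for this example.

```powershell
# Hypothetical helper: keep only counter paths that belong to objects
# whose name starts with the given prefix.
function Select-PvsCounter {
    param(
        [Parameter(ValueFromPipeline = $true)]
        [string]$Path,
        [string]$Prefix = 'Citrix PVS'
    )
    process {
        # typeperf -q prints one counter path per line, e.g. \Object\Counter
        if ($Path -like "\$Prefix*") { $Path }
    }
}

# Usage on a Provisioning Server:
#   typeperf -q | Select-PvsCounter
# typeperf can also list a single object directly:
#   typeperf -q "Citrix PVS StreamProcess"
```

The second usage line needs no helper at all, but the filter is handy when you are not yet sure of the exact object name.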
So let’s add these to the Telegraf agent so the values are written to the database at every collection interval.
[[inputs.win_perf_counters.object]]
  ObjectName = "Citrix PVS StreamProcess"
  Instances = ["------"] # Use 6 x - to remove the Instance bit from the counterPath.
  Counters = [
    "Device Count Forced Reconnect",
    "Device Count Cache Failover",
    "Device Count Timeout",
    "Device Count Active",
    "IO-Reply Send failed",
    "Vdisk READ failed",
    "Vdisk WRITE failed",
    "Rejected Login Count - Server Not Available For vDisk",
    "Rejected Login Count - Server Busy",
    "Rejected Login Count - vDisk Not Available",
    "Rejected Login Count - Device Not Found",
    "Total Target Login Attempts",
    "Total Target Reconnect Count",
  ]
  Measurement = "PVS_Stream"
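Before saving, it can be worth checking that the counter names in the list really resolve on the server, because a typo in a name is easy to miss. A rough sketch, assuming the object has no instances (which is why the config uses "------"), so a counter is addressed as \Object\Counter. Get-PvsCounterPath is a hypothetical helper name, and only three of the counters are shown.

```powershell
# Hypothetical helper: build full counter paths for an object without instances.
function Get-PvsCounterPath {
    param(
        [string]$ObjectName,
        [string[]]$Counters
    )
    foreach ($counter in $Counters) {
        "\$ObjectName\$counter"
    }
}

$paths = Get-PvsCounterPath -ObjectName 'Citrix PVS StreamProcess' -Counters @(
    'Device Count Active',
    'Device Count Timeout',
    'Total Target Login Attempts'
)

# On the Provisioning Server, read each counter once.
# An error for a path points at a wrong object or counter name:
#   $paths | ForEach-Object { Get-Counter -Counter $_ -MaxSamples 1 }
```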
Save and test the configuration.
On the next screen, I suggest you copy these values and save them somewhere. You will need them later on.
Next, download the Telegraf agent. You can generate the correct download URL here: Downloads (influxdata.com)
At the time of writing, 1.23.3 is the current version. You can use these PowerShell commands to download and extract the Telegraf agent. Run them on each PVS server (or on some other machine, then extract the files and deploy them by other means, for example with Ivanti Automation. Be creative, it's up to you).
wget https://dl.influxdata.com/telegraf/releases/telegraf-1.23.3_windows_amd64.zip -UseBasicParsing -OutFile telegraf-1.23.3_windows_amd64.zip
Expand-Archive .\telegraf-1.23.3_windows_amd64.zip -DestinationPath 'C:\Program Files\InfluxData\telegraf'
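If you deploy to several servers, or want to bump the version later, the two commands can be wrapped so the version number lives in one place. This is only a sketch: the function names are made up, and it assumes other releases follow the same URL pattern as the 1.23.3 download above.

```powershell
# Hypothetical helper: build the download URL for a given Telegraf version.
# Assumes the same URL pattern as the 1.23.3 release used in this blog.
function Get-TelegrafDownloadUrl {
    param([string]$Version)
    "https://dl.influxdata.com/telegraf/releases/telegraf-${Version}_windows_amd64.zip"
}

# Hypothetical helper: download the zip to the current directory and extract it.
function Save-Telegraf {
    param(
        [string]$Version,
        [string]$Destination = 'C:\Program Files\InfluxData\telegraf'
    )
    $zip = "telegraf-${Version}_windows_amd64.zip"
    Invoke-WebRequest -Uri (Get-TelegrafDownloadUrl -Version $Version) -OutFile $zip -UseBasicParsing
    Expand-Archive -Path $zip -DestinationPath $Destination
}

# Usage (run as administrator, since the destination is under Program Files):
#   Save-Telegraf -Version '1.23.3'
```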
Next, configure the API token as an environment variable. The command you saved earlier to set the variable does not work on Windows. I also want to set this as a system variable so I can run Telegraf as a service later on. This variable needs to be set on every machine where Telegraf is going to run!
Open PowerShell as administrator (setx was the only reliable way for me...)
cmd /c 'setx /m INFLUX_TOKEN "YOURAPITOKEN"'
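To confirm the variable really landed at machine level, you can read it back from the system environment instead of from the current session, which setx does not update. A small sketch that prints only the token length, so the token itself does not end up on screen or in a transcript:

```powershell
# Read the machine-level value directly; $env:INFLUX_TOKEN in the current
# session is not updated by setx.
$token = [Environment]::GetEnvironmentVariable('INFLUX_TOKEN', 'Machine')

if ([string]::IsNullOrEmpty($token)) {
    Write-Warning 'INFLUX_TOKEN is not set at machine level.'
} else {
    # Show only the length, never the token itself.
    Write-Output "INFLUX_TOKEN is set ($($token.Length) characters)."
}
```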
Now we need to install Telegraf as a service:
The files are extracted into a subdirectory of C:\Program Files\InfluxData\telegraf.
In this case I downloaded version 1.23.3 so the directory is
"C:\Program Files\InfluxData\telegraf\telegraf-1.23.3"
To install the service, run the following command and replace "YOURCONFIGID" with the config ID you noted down earlier:
& "C:\Program Files\InfluxData\telegraf\telegraf-1.23.3\telegraf.exe" --service install --config https://westeurope-1.azure.cloud2.influxdata.com/api/v2/telegrafs/YOURCONFIGID
To test, we need a new PowerShell session (otherwise the token variable is not yet known):
& "C:\Program Files\InfluxData\telegraf\telegraf-1.23.3\telegraf.exe" --config https://westeurope-1.azure.cloud2.influxdata.com/api/v2/telegrafs/YOURCONFIGID --test
When all is OK, the console prints the performance counters once as a test.
If not (you did not open a new console, or used the wrong token or config URL), you will get a 401.
Of course you rock and the test succeeds.
You can start the service now:
& "C:\Program Files\InfluxData\telegraf\telegraf-1.23.3\telegraf.exe" --service start
Let's go to InfluxDB again and see if the data flows in!
Yeah, the default counters are there, plus the custom PVS_Stream measurement!
Let's open it up further.
And the tags are here!
And the host tag is working (only one PVS server with Telegraf for now).
Once you have defined your query, click RUN to validate the data.
A table view of the data shows up… I don’t like tables that much but I love graphs.
So click to visualize the result (sorry, only one PVS target is active right now, so not much excitement here).
You can repeat the above steps with the filters you like, or combine them all (the power of Influx!).
A very nice feature is the ability to create a dashboard!
Change the name, click ADD CELL and define the query you would like to graph. In this example I combine all the device counts in one graph.
At the top left you can change the graph type.
When you are happy, click the checkmark.
And there is the first graph on your new dashboard.
Continue until you have a dashboard with everything you want! When it has been running for a while you get something like this…
PVS Target Devices
Next is to add the write cache usage of the PVS targets to Telegraf so we get the complete picture.
Sadly, these Windows performance counters don’t exist by default. But in 2016 Remko Weijnen wrote a blog on how to get the write cache usage as performance counters. It is a couple of years old but still relevant.
Link to the installer: https://app.box.com/s/25tt2xs62hyv48e62daat44cy1hrlhd1
Download, install and configure the performance counters according to the blog in the link.
Execute in a command prompt on a target device: typeperf -q
Yes here they are!
Let's add these to the telegraf agent
[[inputs.win_perf_counters.object]]
  ObjectName = "PVS Memory Usage"
  Instances = ["------"] # Use 6 x - to remove the Instance bit from the counterPath.
  Counters = [
    "PVS RAM Cache (MB).",
    "PVS MetaData (MB).",
    "PVS Write cache VHD Disk size (MB).",
    "PVS RAM Cache Percent Used %",
  ]
  Measurement = "PVS_Target_CacheUsage"
So in the end my Telegraf configuration looks like this:
Now repeat all the steps for the Telegraf agent, but this time in your golden image. Create a dashboard and you can have all the targets in one dashboard too!
The next cool thing you can do is link InfluxDB Cloud to Grafana (cloud or on-premises) to create even better dashboards, but that’s something for another time!
Conclusions & Notes:
I focused on the PVS counters but of course all the windows performance counters can be used, be creative!
Internet connection is required on devices where the Telegraf agent is running.
Really fast to set up! I personally automated the install and configuration of the Telegraf agent in an Ivanti Automation module, and I would suggest doing something similar. Of course you can contact me if you are interested in the AM module.
I created one config file for both the PVS server and the target devices. Of course you can split these up if you want to. Combining works fine in this case, because the PVS server counters don’t exist on the PVS targets and vice versa. But I can imagine that when you start adding more counters, you would split this up for administrative purposes.
Additional info on the Citrix performance counters: What’s new | Citrix Provisioning 1912 LTSR
Great Article about InfluxDB(What/When/What): InfluxDB – top database management for time series databases - IONOS
And some more about the Telegraf Agent: Telegraf Open Source Server Agent | InfluxDB (influxdata.com)