My understanding from attempts to recover a failed disk on a DCD is that each node gets a portion of the logs along with what it needs to rebuild missing bits. Think of it like a RAID disk set: DCD1 gets the first bit, DCD2 gets the second bit, DCD3 gets the third bit, then DCD1 gets the check bit. This cycles through the database shards, so collectively the nodes hold all the data and can recover it after a failure. This is also why three or more DCDs are necessary if you want to be able to recover from a DCD failure.
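The rotation described above can be sketched in Python, purely as an analogy: data blocks are striped across three nodes with an XOR parity (check) block whose position rotates each stripe, RAID-5 style. This is an illustration of the idea, not actual DCD code; the function names are made up.

```python
# Hypothetical sketch of the RAID-5-style rotation described above:
# data is striped across three nodes, and the parity (check) block
# rotates, so any single node's share can be rebuilt from the others.
# This is only an analogy for how DCD log shards might be distributed.

def stripe_with_parity(blocks, nodes=3):
    """Distribute data blocks across nodes in groups of (nodes - 1),
    adding one XOR parity block per group and rotating its position."""
    placement = {n: [] for n in range(nodes)}
    data_per_stripe = nodes - 1
    for stripe, start in enumerate(range(0, len(blocks), data_per_stripe)):
        group = blocks[start:start + data_per_stripe]
        parity = 0
        for b in group:
            parity ^= b
        parity_node = stripe % nodes  # parity position rotates per stripe
        data_nodes = [n for n in range(nodes) if n != parity_node]
        for node, block in zip(data_nodes, group):
            placement[node].append(("data", block))
        placement[parity_node].append(("parity", parity))
    return placement

def rebuild_block(surviving_blocks):
    """XORing the surviving blocks of a stripe recovers the missing one."""
    out = 0
    for _, b in surviving_blocks:
        out ^= b
    return out
```

With only two nodes there is no third place to hold an independent check block for each stripe, which is the intuition for why three or more DCDs are needed for recovery.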
Hi @David_Larsen, my DCD is a VM, not a physical appliance, and RAID is maintained by my physical server (Dell).
Also, I don't have enough resources in my physical server to install all three DCD nodes.
Since disk redundancy is maintained by my physical server, will I still need to install all three DCD nodes, or will one or two be enough?
It is quite common to have a DCD in each datacentre and use zoning - you should send the BIG-IP logs to the closest DCD.
However, log destination failover is handled by the BIG-IP, i.e. use a pool to select the destination DCD.
Statistics failover is also handled by the BIG-IP: it will send statistics to a backup DCD in the same zone, as configured in AVR.
So the DCDs act as backups in terms of failover, but it is not the case that all logs are replicated to all DCDs. If you send logs to a DCD and it fails, those logs are lost unless you send them to two different DCDs using the BIG-IP publisher. Obviously that doubles the disk usage, network utilisation, etc.
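The trade-off described above can be shown with a minimal sketch: a publisher that fans each log entry out to two destinations survives a single DCD failure, at double the storage and network cost. The class names here (`Dcd`, `Publisher`) are illustrative, not BIG-IP APIs.

```python
# Minimal sketch of dual publishing: every entry goes to two DCDs,
# so one DCD failure loses nothing already sent, but each entry is
# stored and transmitted twice. Illustrative only, not BIG-IP code.

class Dcd:
    def __init__(self, name):
        self.name = name
        self.logs = []
        self.up = True

    def receive(self, entry):
        if self.up:  # a failed DCD silently drops what it would have stored
            self.logs.append(entry)

class Publisher:
    def __init__(self, destinations):
        self.destinations = destinations

    def publish(self, entry):
        for dcd in self.destinations:  # fan out to every destination
            dcd.receive(entry)

dcd1, dcd2 = Dcd("dcd1"), Dcd("dcd2")
pub = Publisher([dcd1, dcd2])
pub.publish("log entry 1")
dcd1.up = False             # simulate a DCD failure
pub.publish("log entry 2")  # entry 2 lands only on dcd2
```

After the failure, everything published so far is still present on dcd2; with a single destination, entry 2 would simply be gone.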
I was under the impression that the DCDs were an Elasticsearch cluster under the hood, in a 2+1 setup.
So you could send the data to one DCD and the cluster would spread the data across two nodes, keeping one replica copy.
Which would mean that if you lost a DCD the data would be preserved; you'd just need to get the failed unit back online to rebuild the shards and restore your resilience.
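The layout described above can be sketched as an Elasticsearch-style placement: each shard gets a primary and one replica on two different nodes, so losing any single node still leaves a full copy of every shard. This is an illustration of the replication idea under that assumption, not actual DCD internals.

```python
# Hypothetical sketch of shard placement with one replica per shard:
# primary and replica always land on different nodes (round-robin),
# so any single node loss leaves at least one copy of every shard.

def place_shards(num_shards, nodes):
    """Place each shard's primary and replica on two different nodes."""
    placement = []
    n = len(nodes)
    for shard in range(num_shards):
        placement.append({
            "shard": shard,
            "primary": nodes[shard % n],
            "replica": nodes[(shard + 1) % n],  # never the same node
        })
    return placement

def copies_after_loss(placement, lost_node):
    """For each shard, which nodes still hold a copy after lost_node fails."""
    return {
        p["shard"]: [node for node in (p["primary"], p["replica"])
                     if node != lost_node]
        for p in placement
    }
```

Because every shard keeps a copy on a surviving node, the cluster can serve all the data while the failed unit is rebuilt, and re-replicate the missing shards once it returns.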