That's an interesting question... I haven't had to do packet filters on F5s for a long time (since 9.1, IIRC).
However, I'm surprised that you found pfs slower and more CPU-hungry than iRules. We probably need someone like Spark or another developer to give us a clear answer on which part of the switch/tmm/host kernel/service kernel gets the packets first, and where the flow path goes if you use pfs vs iRules. (I really would have expected iRules to consume more CPU though, especially if you're doing DG lookups.)
Although perhaps it's something strange, like the pf having to run on the management host, so packets need to be transported over to it for filtering (just like tcpdump), vs the optimised paths that would be in place for TMM and iRule processing of packets...
H