I'm not well versed in the field, but I understand that large tech companies hosting user-generated content match the hashes of uploaded files against a list of known-bad hashes as part of their strategy to detect and tackle such content.
Could a strategy like that be adopted as a first pass, both to improve detection and to reduce the compute load of running every file through an AI model?
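To make the idea concrete, here's a rough sketch of what I have in mind, not a claim about how any particular platform does it. The names `screen`, `known_bad` and `classify` are made up for illustration, and `classify` stands in for whatever AI model is already in use. I've assumed plain SHA-256 over the file bytes.

```python
from __future__ import annotations

import hashlib
from typing import Callable


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the file contents as a hex string."""
    return hashlib.sha256(data).hexdigest()


def screen(
    data: bytes,
    known_bad: set[str],
    classify: Callable[[bytes], bool],
) -> tuple[bool, str]:
    """Flag a file, trying the cheap hash lookup before the expensive model.

    Returns (flagged, stage), where stage is "hash" if the known-bad list
    caught the file and "model" if it had to fall through to the classifier.
    `classify` is a hypothetical stand-in for the existing AI model.
    """
    if sha256_hex(data) in known_bad:
        # Exact match against the known-bad list: no model call needed.
        return True, "hash"
    # Unknown file: only now pay for the model.
    return classify(data), "model"
```

One caveat with this version: an exact hash only matches byte-identical copies, so any re-encoding or edit changes the digest and the file falls through to the model anyway. It would save compute only on exact re-uploads of files already on the list.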
It’s not as though the existence and mechanisms of piracy are a closely guarded secret. There’s a decent chance that they’ll learn about it and attempt it independently, and the method they find online might expose them to greater risk than if they went about it with more consideration.
On that basis, I think that knowledge transfer is at worst harm reduction. If it’s immoral, which I don’t believe it is, then at the very least your intervention could prevent them from being preyed upon by some copyright troll company when they do it despite your silence or protestations.