I'm part of a team that manages the Wells Fargo Community. It's not an exaggeration to say that we consider the risks associated with just about every aspect of user-generated content within our external-facing community. While there will always be some risks in an external-facing community, we have taken a variety of steps to mitigate them.
Here are some examples that pertain to your question:
- the only files that we currently allow users to upload are avatars and profile images
- every avatar and profile image is reviewed by a moderator prior to being posted in the community
- we allow members to include website addresses/domain names in their posts, but we use the interceptors to send them to the moderation queue before they go live
- a moderator reviews the website addresses/domain names using several tools to determine if they go to a malicious destination prior to reviewing the site itself
- the moderator then makes sure the content on that site is appropriate and on-topic
- at that point the moderator will approve the post so it appears on the site
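For anyone wanting to picture the link-handling steps above, here is a minimal sketch in Python. It is not how our platform's interceptors are implemented. The pattern, the function name `route_post`, and the return values are all hypothetical and only illustrate the idea of holding any post that contains a link until a moderator has looked at it.

```python
import re

# Hypothetical pattern: matches http(s) URLs, www-prefixed hosts,
# and bare domain names such as example.com.
URL_PATTERN = re.compile(
    r"(?:https?://|www\.)\S+|\b[a-z0-9-]+(?:\.[a-z0-9-]+)*\.[a-z]{2,}\b",
    re.IGNORECASE,
)


def route_post(body: str) -> str:
    """Return 'moderation_queue' if the post contains a link, else 'publish'."""
    if URL_PATTERN.search(body):
        return "moderation_queue"
    return "publish"
```

A pattern this loose will also catch things like file names (for example `notes.txt`), which is acceptable here because a false positive only means a moderator takes a look before the post goes live.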
We realize that these policies and procedures may not make sense for every community. With that said, they are precautions that help us sleep at night.
Hope that helps.
It's definitely a challenge to stay vigilant without stifling interaction in the community.
On SCN, we do not review anything before it's published; we rely on the community to report abuse. Of course, this isn't enough. So in addition to the keyword interceptor, I use Google Alerts to notify me when spammy keywords appear on the site. This won't catch everything, but it will catch what Google sees, and that's my biggest concern: lots of spam will affect search rankings, which in turn will affect site traffic. I've got alerts set up for every term I can think of related to pills, porn, gambling, forex, and then some. (I became a bit paranoid that my company might log my activity as suspicious when I was typing in all these terms... so far I've heard nothing.) Google Alerts is also helpful for keywords that can't be blocked by the keyword interceptor; in our case that's the term "casino", since we have clients that are casinos.
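To illustrate the split between terms that can be blocked outright and terms that can only be watched, here is a rough Python sketch. It is not the keyword interceptor's actual logic or configuration. The term lists, the `classify` function, and its return values are made up for the example.

```python
import re

# Illustrative lists only; a real deployment would have far more terms.
BLOCKED_TERMS = {"forex", "gambling"}  # safe to block outright
WATCHED_TERMS = {"casino"}             # legitimate for some clients, so only watched


def classify(text: str) -> str:
    """Return 'block', 'watch', or 'allow' based on whole-word matches."""
    words = set(re.findall(r"[a-z0-9']+", text.lower()))
    if words & BLOCKED_TERMS:
        return "block"
    if words & WATCHED_TERMS:
        return "watch"
    return "allow"
```

Matching on whole words rather than substrings keeps a blocked term from firing on unrelated words that merely contain it.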
We also have a very limited set of file types that can be uploaded: .asc, .txt, .text, .xml, .xsl, .gif, .png, .jpeg, .jpg, .jpe
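A file-type restriction like this amounts to an allowlist check on the extension. Here is a minimal Python sketch of the idea, using the extensions listed above. The function name `is_allowed_upload` is hypothetical, and this is not the platform's own upload filter.

```python
from pathlib import PurePath

# Extensions taken from the list above.
ALLOWED_EXTENSIONS = {
    ".asc", ".txt", ".text", ".xml", ".xsl",
    ".gif", ".png", ".jpeg", ".jpg", ".jpe",
}


def is_allowed_upload(filename: str) -> bool:
    """Return True only if the file's final extension is on the allowlist."""
    return PurePath(filename).suffix.lower() in ALLOWED_EXTENSIONS
```

Checking only the final extension means a name like `notes.txt.exe` is rejected. Keep in mind that an extension check says nothing about what the file actually contains.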