Spam filter in chat

I've had enough of not being able to use the chat because of the newline spam. Mods usually don't do anything; the typical answer is something like "don't whisper me, use report so I can ignore you, and let BR hackers get your firsts for hours". I've also noticed that spammers now use some kind of bot that generates new usernames to spam with, so the moderators can't do anything about it.

1) Create a list of TFM copy sites: if the name of a banned site appears in a chat message (a simple regex match), the message won't be sent to the other room members, and the sender gets an automatic 1 hour mute. (You should store IP addresses, or a hash made from them, to compare against, so spammers can't dodge the mute with multiple accounts; or make account creation take more time.)
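To make idea 1 a bit more concrete, here's a rough Python sketch. Everything in it is an assumption for illustration: the domain names are placeholders, `allow_message` and `ip_key` are made-up names, and the mute table is just an in-memory dict. It's not how the game server actually works, only one way the check could look.

```python
import hashlib
import re
import time

# Hypothetical placeholder domains; the real list would hold the copy sites.
BANNED_SITES = ["copysite.example", "fake-tfm.example"]
MUTE_SECONDS = 60 * 60  # the 1 hour automute from the suggestion

# One case-insensitive pattern matching any banned site name.
BANNED_RE = re.compile(
    "|".join(re.escape(site) for site in BANNED_SITES), re.IGNORECASE
)

# Maps a hashed IP to the time (in seconds) when its mute ends.
muted_until = {}


def ip_key(ip_address):
    """Hash the IP so the raw address is not stored.

    A real server would probably mix in a secret salt as well.
    """
    return hashlib.sha256(ip_address.encode("utf-8")).hexdigest()


def allow_message(ip_address, text, now=None):
    """Return True if the message may be shown to the room."""
    now = time.time() if now is None else now
    key = ip_key(ip_address)
    if muted_until.get(key, 0) > now:
        return False  # sender is still muted
    if BANNED_RE.search(text):
        muted_until[key] = now + MUTE_SECONDS  # start the automute
        return False
    return True
```

Because the mute is keyed on the hashed IP and not on the username, a freshly generated account from the same address stays muted.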

2) Create a spam filter:
Create a hash function that turns the message string into a number. It should change only slightly under small modifications (a dot or underscore added at the beginning, characters deleted from the end...), but should be unlikely to flag different strings as spam (silly example: sum the character codes of the string).

Generate a number from every incoming message.
Store the last 10 numbers, for example as records ( rec { var generated_value; var n; } ), for every online user.

When a new message is sent, compare the output number of the function with the stored values.

If the difference is less than a predefined delta value, there's a match:
n += 10; generated_value = new generated_value; on that record
n -= 5; on all of the other records
If there's no match:
replace the record with the lowest n; its new n will be 10

If any record reaches 10 multiplied by the maximum spam limit: spam detected -> automute for 1 hour
(for example, a threshold of 50 will automute on the fifth message)

(The hash function still needs to be developed, and the values (delta, 10, 5, spam limit) need experimenting with.)
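The scoring steps above could look roughly like this in Python. Treat it as a sketch under assumptions, not a finished filter: the hash is the "silly" character-sum example, `DELTA = 100` and `SPAM_LIMIT = 5` are arbitrary placeholder values, the names `Record`, `UserHistory` and `observe` are made up, and starting each user with an empty list that fills up to 10 records is my own guess at how the store begins.

```python
from dataclasses import dataclass, field

DELTA = 100       # assumed tolerance; needs experimenting
HIT = 10          # added to a record's n on a match (also the starting n)
DECAY = 5         # subtracted from every other record on a match
MAX_RECORDS = 10  # stored numbers per online user
SPAM_LIMIT = 5    # spam is flagged once some n reaches HIT * SPAM_LIMIT


def message_value(text):
    """The silly example hash: sum of the character codes."""
    return sum(ord(ch) for ch in text)


@dataclass
class Record:
    generated_value: int
    n: int


@dataclass
class UserHistory:
    records: list = field(default_factory=list)

    def observe(self, text):
        """Score one message; return True when spam is detected."""
        value = message_value(text)
        match = next(
            (r for r in self.records
             if abs(r.generated_value - value) < DELTA),
            None,
        )
        if match is not None:
            match.n += HIT
            match.generated_value = value
            for r in self.records:
                if r is not match:
                    r.n -= DECAY
        elif len(self.records) < MAX_RECORDS:
            # Assumption: fill empty slots before replacing anything.
            self.records.append(Record(value, HIT))
        else:
            # Replace the record with the lowest n.
            weakest = min(self.records, key=lambda r: r.n)
            weakest.generated_value = value
            weakest.n = HIT
        return any(r.n >= HIT * SPAM_LIMIT for r in self.records)
```

With these values, sending the same line five times in a row makes `observe` return True on the fifth call, which is where the 1 hour automute would kick in. The character sum is far too crude for real use (totally different messages can land close together), so the hash is the part that needs the most work.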

This would be great.

Alternatively, to make things less work for the admins, just let us ignore messages that contain certain words/strings? From there it's easy enough to keep a list of bad sites and typical spam talk (collecting these could even be a collaborative project).

I realize that the original suggestion would be nicer, but it'd also be more work and thus less likely to happen.
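For what it's worth, the ignore-list version really is tiny. A minimal sketch in Python, assuming the client keeps a per-player list of ignored strings (the function name `is_hidden` and the example strings are made up):

```python
def is_hidden(text, ignored_strings):
    """Return True if the message contains any ignored word/string.

    The comparison is case-insensitive; empty entries are skipped so
    a blank list item cannot hide every message.
    """
    lowered = text.casefold()
    return any(s.casefold() in lowered for s in ignored_strings if s)


# Hypothetical personal ignore list.
my_ignored = ["copysite.example", "free cheese"]
print(is_hidden("Get FREE CHEESE here", my_ignored))
print(is_hidden("gg, nice round", my_ignored))
```

The first call reports the message as hidden and the second does not. A shared list of bad sites could simply be merged into each player's own list.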
