gosh-darnit

A fast, efficient Go library for profanity detection and censorship.
Features
- Fast: Uses the Aho-Corasick algorithm to match all patterns in a single pass
- Smart word boundaries: Prevents false positives like "bass", "analyst", "assist", "Scunthorpe"
- Evasion resistant: Handles common obfuscation techniques:
  - Leetspeak: @ss, sh1t, fvck, a$$
  - Unicode homoglyphs: Cyrillic, Greek, fullwidth characters
  - Zero-width characters: U+200B, U+200C, U+200D, U+FEFF
  - Repeated characters: fuuuuck, shiiiit
  - NFKC Unicode normalization
- Flexible censoring: Multiple modes for replacing profanity
- Minimal dependencies: Uses only the Go standard library plus golang.org/x/text
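
To illustrate the single-pass matching named in the feature list, here is a minimal Aho-Corasick sketch. It is not the library's implementation: the names `node`, `build`, and `search` are hypothetical, the word list uses mild placeholder words, and word-boundary checks and normalization are left out.

```go
package main

import "fmt"

// node is one state of the automaton (hypothetical type for this sketch).
type node struct {
	next map[rune]*node // trie edges
	fail *node          // longest proper suffix that is also a trie prefix
	out  []string       // patterns ending at this state
}

func newNode() *node { return &node{next: map[rune]*node{}} }

// build inserts every pattern into a trie, then sets failure links
// breadth-first so shallower states are finished before deeper ones.
func build(patterns []string) *node {
	root := newNode()
	for _, p := range patterns {
		cur := root
		for _, r := range p {
			if cur.next[r] == nil {
				cur.next[r] = newNode()
			}
			cur = cur.next[r]
		}
		cur.out = append(cur.out, p)
	}
	queue := []*node{}
	for _, child := range root.next {
		child.fail = root
		queue = append(queue, child)
	}
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		for r, child := range cur.next {
			f := cur.fail
			for f != nil && f.next[r] == nil {
				f = f.fail
			}
			if f == nil {
				child.fail = root
			} else {
				child.fail = f.next[r]
			}
			// Inherit patterns that end at the fallback state.
			child.out = append(child.out, child.fail.out...)
			queue = append(queue, child)
		}
	}
	return root
}

// search scans text once, left to right, and collects every pattern hit.
func search(root *node, text string) []string {
	var found []string
	cur := root
	for _, r := range text {
		for cur != root && cur.next[r] == nil {
			cur = cur.fail
		}
		if n := cur.next[r]; n != nil {
			cur = n
		}
		found = append(found, cur.out...)
	}
	return found
}

func main() {
	root := build([]string{"darn", "heck", "arn"})
	fmt.Println(search(root, "what the heck, darn it")) // [heck darn arn]
}
```

Because failure links let the scan fall back without re-reading input, the cost of a search depends on the text length and the number of matches rather than on how many patterns are in the list.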
Installation
go get github.com/geoherna/gosh-darnit
Usage
Basic Detection
package main

import (
	"fmt"

	goshdarnit "github.com/geoherna/gosh-darnit"
)

func main() {
	// Check if text contains profanity
	if goshdarnit.IsProfane("What the fuck?") {
		fmt.Println("Profanity detected!")
	}

	// Find which words matched
	words := goshdarnit.FindProfanity("This is some shit")
	fmt.Println("Found:", words) // Found: [shit]
}
Censoring
package main

import (
	"fmt"

	goshdarnit "github.com/geoherna/gosh-darnit"
)

func main() {
	text := "What the fuck is this shit?"

	// Replace all characters with asterisks
	fmt.Println(goshdarnit.Censor(text, goshdarnit.CensorAll))
	// Output: "What the **** is this ****?"

	// Keep first character visible
	fmt.Println(goshdarnit.Censor(text, goshdarnit.CensorKeepFirst))
	// Output: "What the f*** is this s***?"

	// Keep first and last characters visible
	fmt.Println(goshdarnit.Censor(text, goshdarnit.CensorKeepFirstLast))
	// Output: "What the f**k is this s**t?"
}
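
For readers curious how the three modes differ at the level of a single word, the following is a rough sketch of the masking step only. The type `maskMode`, its constants, and the function `mask` are hypothetical stand-ins, not the library's `CensorMode` API, and the handling of very short words (fully masked) is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// maskMode is a hypothetical stand-in for the library's CensorMode.
type maskMode int

const (
	maskAll maskMode = iota
	maskKeepFirst
	maskKeepFirstLast
)

// mask replaces the runes of word with asterisks according to mode.
// It counts runes, not bytes, so multi-byte characters mask to one asterisk.
// Words too short to keep the requested characters are fully masked
// (an assumption made for this sketch).
func mask(word string, mode maskMode) string {
	r := []rune(word)
	n := len(r)
	switch {
	case mode == maskKeepFirst && n > 1:
		return string(r[0]) + strings.Repeat("*", n-1)
	case mode == maskKeepFirstLast && n > 2:
		return string(r[0]) + strings.Repeat("*", n-2) + string(r[n-1])
	default:
		return strings.Repeat("*", n)
	}
}

func main() {
	fmt.Println(mask("darn", maskAll))           // ****
	fmt.Println(mask("darn", maskKeepFirst))     // d***
	fmt.Println(mask("darn", maskKeepFirstLast)) // d**n
}
```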
Evasion Detection
The library automatically handles common evasion techniques:
// Leetspeak
goshdarnit.IsProfane("@ss") // true (@ -> a)
goshdarnit.IsProfane("sh1t") // true (1 -> i)
goshdarnit.IsProfane("fvck") // true (v -> u)
goshdarnit.IsProfane("a$$") // true ($ -> s)
// Repeated characters
goshdarnit.IsProfane("fuuuuck") // true
goshdarnit.IsProfane("shiiiit") // true
// Unicode homoglyphs (Cyrillic 'а' looks like Latin 'a')
goshdarnit.IsProfane("аss") // true
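
One plausible way to get this behavior is to normalize text before matching. The sketch below is an assumption about the general approach, not the library's code: `normalize`, `zeroWidth`, and `lookalikes` are hypothetical names, the substitution table is a tiny subset, NFKC normalization (which would need golang.org/x/text) is omitted, and collapsing only runs of three or more characters is a heuristic chosen so that ordinary double letters survive.

```go
package main

import (
	"fmt"
	"strings"
)

// zeroWidth holds the zero-width code points named in the feature list.
var zeroWidth = map[rune]bool{
	'\u200B': true, '\u200C': true, '\u200D': true, '\uFEFF': true,
}

// lookalikes is a small illustrative substitution table (hypothetical).
var lookalikes = map[rune]rune{
	'@':      'a',
	'$':      's',
	'1':      'i',
	'\u0430': 'a', // Cyrillic small a
}

// normalize lowercases text, drops zero-width characters, maps lookalikes
// to letters, and collapses runs of three or more identical runes to one.
func normalize(text string) string {
	var mapped []rune
	for _, r := range strings.ToLower(text) {
		if zeroWidth[r] {
			continue
		}
		if sub, ok := lookalikes[r]; ok {
			r = sub
		}
		mapped = append(mapped, r)
	}
	var out []rune
	for i := 0; i < len(mapped); {
		j := i
		for j < len(mapped) && mapped[j] == mapped[i] {
			j++
		}
		if j-i >= 3 {
			out = append(out, mapped[i]) // long run: keep a single rune
		} else {
			out = append(out, mapped[i:j]...) // short run: keep as written
		}
		i = j
	}
	return string(out)
}

func main() {
	inputs := []string{"d@rn", "daaaarn", "d\u200Barn", "d\u0430rn", "a$$et"}
	for _, s := range inputs {
		fmt.Printf("%q -> %q\n", s, normalize(s))
	}
}
```

A real filter would also need to map match positions in the normalized text back to the original so that censoring replaces the right characters; this sketch does not attempt that.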
False Positive Prevention
Word boundary detection prevents common false positives:
goshdarnit.IsProfane("The bass is great") // false
goshdarnit.IsProfane("She's an analyst") // false
goshdarnit.IsProfane("I need to assist you") // false
goshdarnit.IsProfane("Scunthorpe is a town") // false
goshdarnit.IsProfane("The shitake mushrooms") // false
goshdarnit.IsProfane("Assess the situation") // false
goshdarnit.IsProfane("Classic movie") // false
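
The idea behind these results can be sketched as a whole-word check: a match counts only when the characters on both sides are not letters or digits. This is an illustration under that assumption, with hypothetical helper names `isWordRune` and `containsWholeWord`; the library's actual boundary rules may be more involved.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

// isWordRune reports whether r counts as part of a word in this sketch.
func isWordRune(r rune) bool {
	return unicode.IsLetter(r) || unicode.IsDigit(r)
}

// containsWholeWord reports whether the lowercase word occurs in text with
// a non-word rune, or the start or end of the text, on both sides.
func containsWholeWord(text, word string) bool {
	if word == "" {
		return false
	}
	text = strings.ToLower(text)
	for from := 0; from < len(text); {
		i := strings.Index(text[from:], word)
		if i < 0 {
			return false
		}
		start := from + i
		end := start + len(word)
		before, _ := utf8.DecodeLastRuneInString(text[:start])
		after, _ := utf8.DecodeRuneInString(text[end:])
		leftOK := start == 0 || !isWordRune(before)
		rightOK := end == len(text) || !isWordRune(after)
		if leftOK && rightOK {
			return true
		}
		from = start + 1 // keep looking for a later occurrence
	}
	return false
}

func main() {
	samples := []string{"The bass is great", "Assess the situation", "What an ass!"}
	for _, s := range samples {
		fmt.Println(s, "->", containsWholeWord(s, "ass"))
	}
}
```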
API Reference
Functions
| Function | Description |
| --- | --- |
| IsProfane(text string) bool | Returns true if text contains profanity |
| ContainsProfanity(text string) bool | Alias for IsProfane |
| Censor(text string, mode CensorMode) string | Replaces profanity with asterisks |
| CensorWithDefault(text string) string | Censors with CensorAll mode |
| FindProfanity(text string) []string | Returns list of matched profane words |
Censor Modes
| Mode | Example | Description |
| --- | --- | --- |
| CensorAll | **** | Replace all characters |
| CensorKeepFirst | f*** | Keep first character visible |
| CensorKeepFirstLast | f**k | Keep first and last characters visible |
Benchmarks on Apple M4 Max:
| Benchmark | Time | Allocations |
| --- | --- | --- |
| CleanShort | ~766ns | 8 allocs |
| ProfaneShort | ~839ns | 9 allocs |
| Leetspeak | ~847ns | 9 allocs |
| RepeatedChars | ~1.0µs | 11 allocs |
| MixedText | ~2.5µs | 14 allocs |
Run benchmarks yourself:
go test -bench=. -benchmem
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Special Shoutouts
Huge thanks to John Kim for the graphic design assets used in this project.
Note on content
This software contains a list of profanities, slurs, and other offensive terms
solely for the purpose of detecting and filtering harmful language in
user-generated content. These terms are included for harm-reduction, research,
and moderation purposes only. Their presence in the source code does not
constitute endorsement or promotion of such language by the authors.
License
MIT License - see LICENSE for details.