
    Content Filtering & Safety

    Automated content filtering is your first line of defense against inappropriate content. This guide covers setting up effective filters, understanding pattern matching, and implementing a layered safety approach.

    What You'll Learn

    • How to configure automated content filtering
    • Pattern matching techniques for effective filtering
    • Choosing the right filter actions
    • Building a comprehensive safety strategy
    • Balancing automation with human moderation

    Understanding Content Filtering

    How Filtering Works

    InterChat's content filtering system operates in real-time:

    1. Message Analysis - Every message is checked against your filter patterns
    2. Pattern Matching - Text is compared using various matching techniques
    3. Action Execution - Violations trigger automated responses
    4. Logging - All filter actions are recorded for review
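
    If it helps to picture the flow, here is a minimal TypeScript sketch of those four steps. Every name in it (Filter, processMessage, auditLog) is hypothetical, not InterChat's actual code, and the matching step is simplified to a substring check:

    // Hypothetical model of the four-step pipeline above; not InterChat's code.
    type FilterAction = "block" | "blacklist" | "alert";

    interface Filter {
      pattern: string;      // e.g. "word", "word*", "*word", "*word*"
      action: FilterAction;
      description: string;
    }

    interface LogEntry {
      content: string;
      filter: Filter;
      timestamp: Date;
    }

    const auditLog: LogEntry[] = [];

    function processMessage(content: string, filters: Filter[]): FilterAction | "pass" {
      // Steps 1-2: analyze the message against every pattern
      // (simplified here to a case-insensitive substring check)
      const violated = filters.find((f) =>
        content.toLowerCase().includes(f.pattern.replaceAll("*", "").toLowerCase())
      );
      if (!violated) return "pass";

      // Step 4: record the hit for moderator review
      auditLog.push({ content, filter: violated, timestamp: new Date() });

      // Step 3: hand the configured action back to the caller to execute
      return violated.action;
    }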

    Filter Actions Explained

    Block

    Prevents the message from appearing

    When to use:

    • Obvious inappropriate content
    • Spam patterns
    • Prohibited links or content
    • Clear rule violations

    What happens:

    • Message is stopped before reaching other servers
    • Original sender sees their message normally
    • Other servers never see the blocked content
    • Action is logged for moderator review

    Best for: High-confidence violations that should never appear

    Blacklist

    Automatically removes the user from the hub (10 minutes)

    When to use:

    • Severe violations (hate speech, harassment)
    • Spam bots or malicious accounts
    • Content that requires immediate removal
    • Patterns indicating bad faith participation

    What happens:

    • User is immediately blacklisted for 10 minutes
    • All their recent messages are flagged for review
    • They cannot participate until blacklist expires
    • Moderators are notified of the action

    Best for: Serious violations requiring immediate action

    Send Alert

    Notifies moderators for human review

    When to use:

    • Borderline content needing context
    • New or experimental filter patterns
    • Content that might have false positives
    • Situations requiring human judgment

    What happens:

    • Message appears normally in all servers
    • Alert sent to moderators with violation details
    • Moderators can take action if needed
    • Helps refine filter patterns over time

    Best for: Content that needs human evaluation

    Setting Up Content Filtering

    Access Filter Configuration

    /hub config anti-swear hub:YourHubName

    This opens the content filtering management interface.

    Create Your First Filter

    1. Click "Add New Filter"
    2. Enter your pattern (see pattern guide below)
    3. Choose the appropriate action
    4. Add a description for your team
    5. Save the filter

    Test Your Filters

    • Start with "Send Alert" action for new patterns
    • Test with trusted community members
    • Monitor alerts for false positives
    • Adjust patterns based on results

    Refine and Optimize

    • Review filter logs regularly
    • Update patterns based on new violations
    • Remove or modify filters causing false positives
    • Share effective patterns with your moderation team

    Pattern Matching Guide

    Pattern Types

    Pattern: word

    Matches: Only the exact word "word"

    Examples:

    • ✅ "word" (matches)
    • ❌ "words" (doesn't match)
    • ❌ "keyword" (doesn't match)
    • ❌ "sword" (doesn't match)

    Best for:

    • Specific prohibited terms
    • Exact phrases or commands
    • Brand names or specific references
    • When precision is critical

    Use case: Blocking specific slurs or exact spam phrases

    Pattern: word*

    Matches: Any word starting with "word"

    Examples:

    • ✅ "word" (matches)
    • ✅ "words" (matches)
    • ✅ "wordsmith" (matches)
    • ❌ "keyword" (doesn't match)
    • ❌ "sword" (doesn't match)

    Best for:

    • Word variations and plurals
    • Terms with common suffixes
    • Catching creative spelling attempts
    • Broad category filtering

    Use case: Blocking "spam*" to catch "spam", "spammer", "spamming"

    Pattern: *word

    Matches: Any word ending with "word"

    Examples:

    • ✅ "word" (matches)
    • ✅ "keyword" (matches)
    • ✅ "password" (matches)
    • ❌ "words" (doesn't match)
    • ❌ "wordsmith" (doesn't match)

    Best for:

    • Common word endings
    • Suffix-based violations
    • Catching variations of prohibited terms
    • Technical term filtering

    Use case: Blocking "*bot" to catch "spambot", "autobot", etc.

    Pattern: *word*

    Matches: Any text containing "word"

    Examples:

    • ✅ "word" (matches)
    • ✅ "words" (matches)
    • ✅ "keyword" (matches)
    • ✅ "sword" (matches)
    • ✅ "wordsmith" (matches)

    Best for:

    • Broad content filtering
    • Catching creative spelling
    • General topic restrictions
    • When context doesn't matter

    Use case: Blocking "*discord.gg*" to catch all Discord invite links

    ⚠️ Warning: Can cause many false positives
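
    To make the four forms concrete, here is one plausible way to express the documented rules as regular expressions in TypeScript. This is an illustration of the behavior described above, not InterChat's actual matcher:

    // Illustrative matcher for the four documented pattern forms.
    function matchesPattern(text: string, pattern: string): boolean {
      const starts = pattern.startsWith("*");
      const ends = pattern.endsWith("*");
      const core = pattern.replace(/^\*|\*$/g, "");
      // Escape regex metacharacters so patterns like "discord.gg" stay literal
      const escaped = core.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

      let source: string;
      if (starts && ends) source = escaped;             // *word* : contains
      else if (ends) source = `\\b${escaped}\\w*\\b`;   // word*  : prefix
      else if (starts) source = `\\b\\w*${escaped}\\b`; // *word  : suffix
      else source = `\\b${escaped}\\b`;                 // word   : exact word
      return new RegExp(source, "i").test(text);
    }

    // The documented examples hold under this interpretation:
    console.log(matchesPattern("keyword", "word"));    // false (exact)
    console.log(matchesPattern("wordsmith", "word*")); // true  (prefix)
    console.log(matchesPattern("password", "*word"));  // true  (suffix)
    console.log(matchesPattern("sword", "*word*"));    // true  (contains)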

    Advanced Pattern Strategies

    Layered Filtering Approach:

    1. Exact matches for known violations
    2. Prefix/suffix matches for variations
    3. Contains matches for broad categories
    4. Alert actions for borderline content

    Example Filter Set for Spam Prevention:

    Pattern: "discord.gg" | Action: Block | Description: Discord invite links
    Pattern: "bit.ly*" | Action: Send Alert | Description: Shortened URLs for review
    Pattern: "*free nitro*" | Action: Blacklist | Description: Common scam phrase
    Pattern: "join my server" | Action: Send Alert | Description: Potential server advertising

    Building Effective Filter Lists

    Common Filter Categories

    Profanity and Hate Speech:

    Approach:

    • Start with exact matches for clear violations
    • Use prefix/suffix for variations
    • Consider context and community standards
    • Regular review and updates needed

    Example Patterns:

    [slur] | Block | Exact hate speech term
    [profanity]* | Block | Profanity and variations
    *[offensive-term]* | Send Alert | Borderline content

    Best Practices:

    • Research community-specific terms
    • Consider cultural and regional differences
    • Balance strictness with false positives
    • Have clear appeal processes

    Spam and Scam Prevention:

    Common Spam Patterns:

    *free nitro* | Blacklist | Nitro scams
    *click here* | Send Alert | Suspicious links
    *discord.gg* | Block | Invite links
    *dm me* | Send Alert | Potential spam
    *check my bio* | Send Alert | Profile spam

    Scam Indicators:

    • Promises of free premium services
    • "Urgent action required" language
    • External link requests
    • Too-good-to-be-true offers

    Strategy:

    • Block obvious scams immediately
    • Alert on suspicious patterns
    • Monitor for new scam techniques

    NSFW Content Filtering:

    Technical Approach:

    • Enable built-in NSFW detection
    • Add text-based NSFW filters
    • Consider community standards
    • Clear policies on borderline content

    Example Filters:

    *nsfw* | Block | NSFW references
    *18+* | Send Alert | Age-restricted content
    [explicit-terms] | Block | Sexual content

    Considerations:

    • Community age demographics
    • Professional vs. casual environments
    • Cultural sensitivity
    • Clear content policies

    Server Advertising Prevention:

    Advertising Patterns:

    *discord.gg* | Block | Discord invites
    *join my server* | Send Alert | Direct advertising
    *new server* | Send Alert | Server promotion
    *looking for members* | Send Alert | Recruitment

    Balanced Approach:

    • Block direct invite links
    • Alert on promotional language
    • Allow legitimate server mentions
    • Consider hub purpose (networking vs. focused discussion)

    Exceptions:

    • Collaborative project invitations
    • Educational server sharing
    • Hub-approved partnerships

    Filter Maintenance

    Regular Review Schedule:

    Weekly:

    • Review alert logs for new patterns
    • Check for false positives
    • Update patterns based on violations

    Monthly:

    • Comprehensive filter effectiveness review
    • Community feedback incorporation
    • Pattern optimization and cleanup

    Quarterly:

    • Complete filter strategy assessment
    • Community standards review
    • Technology and trend updates

    Advanced Safety Features

    Hub-Level Safety Settings

    /hub config settings hub:YourHubName

    Essential Safety Settings:

    Block NSFW 🔞

    • Automatically detects and blocks inappropriate images
    • Uses Discord's built-in NSFW detection
    • Recommended for most communities

    Spam Filter 🛡️

    • Detects and prevents spam messages
    • Includes rate limiting and pattern detection
    • Should always be enabled
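
    "Rate limiting" here usually means capping how many messages a user can send within a short window. A minimal sliding-window sketch, with made-up thresholds and no claim about InterChat's actual limits:

    // Illustrative sliding-window rate limiter; thresholds are hypothetical.
    const WINDOW_MS = 5_000;  // look at the last 5 seconds
    const MAX_MESSAGES = 5;   // allow at most 5 messages per window

    const recent = new Map<string, number[]>(); // userId -> send timestamps

    function isSpamming(userId: string, now = Date.now()): boolean {
      const times = (recent.get(userId) ?? []).filter((t) => now - t < WINDOW_MS);
      times.push(now);
      recent.set(userId, times);
      return times.length > MAX_MESSAGES;
    }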

    Hide Links 🔗

    • Prevents all links from being sent
    • Extreme measure for high-risk situations
    • Consider impact on legitimate sharing

    Block Invites 📨

    • Prevents Discord invite links
    • Reduces server advertising and raids
    • Consider hub purpose (networking vs. focused)

    Use Nicknames 👤

    • Shows server-specific nicknames
    • Can help or hurt depending on community
    • Consider consistency vs. personalization

    Reactions 😀

    • Allows cross-server emoji reactions
    • Generally positive for community building
    • Can be disabled if misused

    Message Formatting

    • Rich embeds and formatting
    • Usually beneficial for communication
    • Disable if causing display issues
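
    To see the toggles side by side, here is a purely illustrative settings shape with one conservative starting configuration; neither the field names nor the defaults are InterChat's actual schema:

    // Illustrative flags object for the settings above; not InterChat's schema.
    interface HubSafetySettings {
      blockNSFW: boolean;
      spamFilter: boolean;
      hideLinks: boolean;
      blockInvites: boolean;
      useNicknames: boolean;
      reactions: boolean;
      messageFormatting: boolean;
    }

    const conservativeDefaults: HubSafetySettings = {
      blockNSFW: true,     // recommended for most communities
      spamFilter: true,    // should always be enabled
      hideLinks: false,    // extreme measure; enable only in high-risk situations
      blockInvites: true,  // reconsider for networking-oriented hubs
      useNicknames: true,
      reactions: true,
      messageFormatting: true,
    };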

    Layered Security Approach

    Level 1: Automated Filtering

    • Content filters for obvious violations
    • NSFW detection for inappropriate images
    • Spam filters for repetitive content
    • Invite blocking for unwanted advertising

    Level 2: Community Reporting

    • Easy reporting system for users
    • Quick moderator notification
    • Context-aware human review
    • Community-driven safety

    Level 3: Active Moderation

    • Human oversight and intervention
    • Complex situation handling
    • Appeal and review processes
    • Community relationship building

    Level 4: Emergency Response

    • Rapid response for serious violations
    • Temporary hub restrictions
    • Coordination with Discord Trust & Safety
    • Crisis communication plans

    Best Practices

    Filter Design Principles

    Start Conservative:

    • Begin with "Send Alert" for new patterns
    • Gradually increase strictness based on results
    • Monitor for false positives carefully
    • Adjust based on community feedback

    Be Specific:

    • Prefer exact matches when possible
    • Use broader patterns only when necessary
    • Document the reasoning behind each filter
    • Regular review and refinement

    Consider Context:

    • What's appropriate varies by community
    • Professional vs. casual environments
    • Age demographics and cultural factors
    • Hub purpose and goals

    Community Communication

    Transparency:

    • Explain filtering policies clearly
    • Provide examples of what's not allowed
    • Offer appeal processes for mistakes
    • Regular policy updates and communication

    Education:

    • Help members understand the rules
    • Provide guidance on appropriate content
    • Share the reasoning behind policies
    • Encourage positive community culture

    Avoiding Common Pitfalls

    Over-Filtering:

    • Too many false positives frustrate users
    • Overly broad patterns catch innocent content
    • Excessive automation reduces human judgment
    • Community feels unwelcome or restricted

    Under-Filtering:

    • Inappropriate content damages community
    • Members leave due to poor environment
    • Reputation and growth suffer
    • Moderation team becomes overwhelmed

    Inconsistent Enforcement:

    • Different standards for different users
    • Unclear or changing policies
    • Lack of documentation and training
    • Community loses trust in moderation

    Monitoring and Analytics

    Key Metrics to Track

    Filter Effectiveness:

    • Number of violations caught
    • False positive rate
    • Community feedback on filtering
    • Moderator workload reduction
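
    The false positive rate, for instance, falls straight out of your filter logs once moderators mark overturned actions. A small sketch with a hypothetical log shape:

    // Hypothetical log entry shape; InterChat's real log format may differ.
    interface FilterLogEntry {
      filterPattern: string;
      overturnedOnReview: boolean; // a moderator judged this hit a false positive
    }

    function falsePositiveRate(log: FilterLogEntry[]): number {
      if (log.length === 0) return 0;
      const overturned = log.filter((e) => e.overturnedOnReview).length;
      return overturned / log.length;
    }

    // Example: 3 overturned out of 40 hits -> 7.5% false positive rate.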

    Community Health:

    • Report frequency and types
    • Member satisfaction with safety
    • Retention rates and growth
    • Quality of discussions

    Regular Assessment

    Monthly Filter Review:

    1. Analyze filter logs and statistics
    2. Identify patterns in violations
    3. Assess false positive rates
    4. Gather community feedback
    5. Update filters based on findings

    Quarterly Safety Audit:

    1. Comprehensive policy review
    2. Community standards assessment
    3. Technology and trend updates
    4. Moderation team training needs
    5. Strategic safety planning

    Next Steps

    Your community is safer! Continue building a comprehensive safety strategy with the other moderation guides in this documentation.


    Need help with content filtering? Join our support community to get advice from experienced moderators and safety experts!