How To Use Janitor AI? [2024]

Janitor AI is an artificial intelligence system designed to help keep online communities positive. It uses advanced natural language processing to analyze conversations and remove toxic, harassing, and harmful content.

Janitor AI was created by Anthropic, an AI safety startup focused on building beneficial AI systems. It is currently being tested on various websites and forums as a content moderation tool.

Some key capabilities of Janitor AI include:

  • Detecting and removing hate speech, threats, insults, and identity attacks
  • Understanding context to differentiate harmful from benign uses of potentially offensive language
  • Providing explanations when removing content to improve transparency
  • Operating in a privacy-preserving way without accessing user identities

Getting Access to Janitor AI

As Janitor AI is still in development, access is currently limited. However, Anthropic plans to make it more widely usable over time. Here are some ways you may be able to get access:

  • Apply for early access as a beta tester. Anthropic is selecting individual users, moderators, and communities to help improve Janitor AI.
  • Join the waitlist on the Anthropic website to be notified when Janitor AI releases more broadly.
  • Reach out to the Anthropic sales team if you operate a platform or community interested in piloting Janitor AI.
  • Follow Anthropic announcements to get news of public release plans for Janitor AI. Signing up for the newsletter is a great option.

If granted access, Anthropic will provide documentation on properly configuring and using Janitor AI for your use case.

Using Janitor AI on Your Community

If you’ve been granted Janitor AI access for your online community, here is a step-by-step guide on putting it to use:

Set-up Process

  1. Work with Anthropic to configure Janitor AI for your platform based on your content rules and guidelines. Ensure proper scoping of analysis.
  2. Implement the Janitor AI API in your system using the provided instructions and sample code. Proper API keys are critical for activation.
  3. Adjust the user interface on your platform to show Janitor AI removal explanations. Make users aware of this new moderation system.

Ongoing Use

  1. Monitor Janitor AI platform analytics to assess its overall efficiency, including unnecessary or missed removals. Provide ongoing feedback to Anthropic to improve accuracy.
  2. Review user appeals when they believe Janitor AI has incorrectly removed their content. Adjust configurations if needed based on appeal results.
  3. Consider implementing pre-screening of user content with Janitor AI before it is posted live. This can enhance protection of your community.
  4. Stay updated on new features, capabilities, and best practices for leveraging Janitor AI as they become available from Anthropic.
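
The pre-screening idea in step 3 can be sketched roughly as follows. This is a minimal illustration, not the actual Janitor AI interface: the `Verdict` shape, the `prescreen` function, and the keyword-based `stub_analyzer` are all hypothetical names invented here, and the stub should be swapped for whatever client your integration documentation provides.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    """Result of analyzing one piece of content (hypothetical shape)."""
    violation: Optional[str]  # e.g. "threat", or None if nothing was found
    explanation: str          # text shown to the user if the post is held


def stub_analyzer(text: str) -> Verdict:
    """Stand-in for the real moderation call (assumption for illustration).

    A keyword check like this is NOT how a real moderation model works;
    it only lets the flow below run end to end.
    """
    if "i will hurt you" in text.lower():
        return Verdict("threat", "This post appears to threaten another user.")
    return Verdict(None, "")


def prescreen(text: str, analyze: Callable[[str], Verdict]) -> dict:
    """Analyze content before it goes live and decide whether to publish."""
    verdict = analyze(text)
    if verdict.violation is None:
        return {"published": True, "explanation": ""}
    # Hold the post and surface the explanation so the user can appeal.
    return {"published": False, "explanation": verdict.explanation}


if __name__ == "__main__":
    print(prescreen("Nice match, well played!", stub_analyzer))
    print(prescreen("I will hurt you.", stub_analyzer))
```

Passing the analyzer in as a parameter keeps the publishing logic independent of the moderation backend, which makes it easier to test and to swap in the real client later.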

Customizing Janitor AI Settings

Janitor AI provides a range of customization options so you can best configure it for your community’s needs:

  • Set prioritization levels for analyzing different rule violations based on severity. For example, threats can be set higher than insults.
  • Choose enforcement actions like removing, hiding, appending warnings, or just monitoring different violation types detected. Granular configurations are possible.
  • Specify which content fields to analyze, such as post bodies, subject lines, usernames, headlines and even images. Focus Janitor in the right areas.
  • Select the types of analysis to enable like hate speech, sexual harassment, self-harm risk, and toxicity. Disable any areas less relevant for your community.
  • Adjust filter settings such as intensity thresholds, false positive tolerance, and reviewer disagreement toggles to fine-tune Janitor AI’s accuracy.
  • Add allow-list exceptions for benign cases that would otherwise trip a rule, like quoting harassing text as an example. This helps avoid incorrect removals.

Ongoing accuracy improvements require providing consistent feedback and quality data back to Janitor AI so it can learn from both good and bad calls. Leverage the various customization hooks to ensure a tailored configuration for the unique needs of your users and platform.
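
To make the options above concrete, here is a small sketch of how such a configuration might be modeled on your side. Every name in it (`ModerationConfig`, `decide`, `by_priority`, the action strings, the 0.0 to 1.0 score scale) is an assumption made for illustration and does not reflect a documented Janitor AI schema.

```python
from dataclasses import dataclass, field


@dataclass
class ModerationConfig:
    """Hypothetical community configuration (not an official schema)."""
    # Higher number means the violation type is handled first.
    priority: dict[str, int] = field(
        default_factory=lambda: {"threat": 3, "hate_speech": 2, "insult": 1}
    )
    # Enforcement action per violation type.
    action: dict[str, str] = field(
        default_factory=lambda: {"threat": "remove", "hate_speech": "hide", "insult": "warn"}
    )
    # Minimum detection score (assumed 0.0 to 1.0) before enforcing.
    threshold: float = 0.8
    # Content fields to analyze.
    fields: tuple[str, ...] = ("body", "subject")
    # Lowercase markers that signal benign quoting.
    allow_list: tuple[str, ...] = ("quoted example:",)


def decide(config: ModerationConfig, violation: str, score: float, text: str) -> str:
    """Return the enforcement action for a single detection."""
    if any(marker in text.lower() for marker in config.allow_list):
        return "none"      # allow-list exception: leave the content alone
    if score < config.threshold:
        return "monitor"   # below threshold: log only, do not enforce
    return config.action.get(violation, "monitor")


def by_priority(config: ModerationConfig, violations: list[str]) -> list[str]:
    """Order detected violation types from most to least severe."""
    return sorted(violations, key=lambda v: config.priority.get(v, 0), reverse=True)


if __name__ == "__main__":
    cfg = ModerationConfig()
    print(decide(cfg, "threat", 0.95, "you are done"))
    print(by_priority(cfg, ["insult", "threat"]))
```

Falling back to "monitor" for low scores and unknown violation types is a deliberately cautious choice in this sketch: it favors missed removals over incorrect ones, which you may want to reverse for high-severity categories.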

Primary Use Cases

Janitor AI is a versatile AI system for moderating digital communities. Here are some of its most common and impactful use cases today:

  1. Social Media Platforms: Major social networks struggle to scale content moderation. Janitor AI can detect policy violations at tremendous scale to keep these platforms safer.
  2. News Comment Sections: Online comments often quickly fill with toxicity. Janitor AI excels at finding the harmful ones automatically so moderators can focus elsewhere.
  3. Gaming Chat: In-game chat platforms struggle with harassment issues dampening users’ enjoyment. Janitor AI helps create healthier game communities.
  4. Dating Apps: Creating safe, respectful dating platforms is crucial for user comfort and retention. Janitor AI protects users from uncomfortable encounters.
  5. Marketplaces: Customer toxicity can be prevalent and hurt brand reputations. Janitor AI removes bad behaviors from community marketplaces and forums.
  6. Higher Education: Schools rely on discussion boards for learning but want to protect students. Automatically detecting concerning posts with Janitor AI enables this.
  7. Peer Support Groups: Some health support communities have faced challenges keeping spaces positive. Janitor AI’s sensitivity excels in these contexts.
  8. Internal Company Chat: Ensuring professional, inclusive communication at work is top priority. Janitor AI fills difficult content gaps legacy filters miss.

With custom analysis capabilities covering threats, bullying, racism, mental health risk, and even microaggressions, Janitor AI provides value across nearly any digital community.

Benefits of Using Janitor AI

Implementing Janitor AI for community moderation and user safety provides a wide range of valuable benefits:

  1. Saves substantial human moderator time by automatically handling content policy enforcement at scale.
  2. Provides 24/7 coverage capacity exceeding human response rates for near real-time content analysis.
  3. Reduces negative brand impact and PR crises caused by toxic content and harassment issues.
  4. Increases user comfort and sense of safety by proactively keeping spaces positive.
  5. Allows redirecting human resources to more complex moderation cases like appeals, warnings, bans etc.
  6. Flexible customization settings enable implementations tailored to each community’s unique needs.
  7. Ongoing active learning and feedback improve Janitor AI’s accuracy over time.
  8. Granular analysis helps catch policy violations legacy filters often miss, like subtle harassment.
  9. Detailed violation reporting provides insights to improve guidelines and reduce rule infractions.

With powerful artificial intelligence capability exceeding traditional filters and massive scalability impossible via manual means, Janitor AI is a game-changing solution for community moderation and governance. The improvements in safety, comfort and savings strongly outweigh costs making this an easy win for credentialed platforms ready to pilot.

Implementation Challenges

While Janitor AI delivers immense value, certain implementation hurdles can arise that call for awareness and mitigation:

  1. Poor API integration leading to analysis gaps or errors prevents Janitor AI from working properly. Carefully following instructions avoids this.
  2. User outrage over removed content is common initially. Clear explanations of violations help overcome this challenge until Janitor AI earns community trust through accuracy.
  3. Potential free speech concerns exist around automated removals. Anthropic’s focus on justification transparency and appeal processes minimizes legitimate critiques.
  4. Gathering quality training data from user bases to improve relevance takes effort, and the collection methodology requires genuine attention to inclusion and privacy protection.
  5. Ongoing configuration tuning is crucial as community norms evolve plus new use cases like images/multimedia arise. Dedicated mod teams help address this workload.
  6. Impatience with AI learning curves can set in if unrealistic accuracy expectations aren’t tempered from the start. Clear communications are essential so users understand the pilot nature.

With deliberate project planning, stakeholder education and steadfast leadership commitment, potential speed bumps during Janitor AI adoption can be effectively managed. But anticipating these likely issues from the outset enables smoother impact at community scale.

Measuring Success

Quantifying program success is key for proving Janitor AI’s value and ROI over time. Track these primary metrics:

  1. Policy Violation Rate – Benchmark the % of posts/messages violating policies pre-launch. Target a significant reduction post-launch, like 25-50%+ less.
  2. Human Moderation Time – Measure hours spent manually reviewing flagged user content pre-launch. Target major reductions in human workload post-Janitor AI launch, such as 40-60%+ less time spent.
  3. Appeal Overturn Rate – Analyze the % of appealed post removals that get overturned as incorrect calls. Target substantial improvements in Janitor AI precision over time, including <15% overturn rate after sufficient learning.
  4. User Sentiment – Survey community members before and after launch to assess improvements in their perceived feeling of safety/comfort from Janitor AI protections. Positive trends validate the benefits.
  5. Toxicity Recurrence – Track specific policy violating users repeating poor behaviors pre & post launch. Janitor AI interventions combined with other warnings/discipline should steadily decrease repeat offense rates over time.

Producing tangible metrics evidencing Janitor AI effectiveness builds internal and external support for program expansion. The data also fuels constructive feedback loops for Anthropic to refine configurations and training. Lastly, sharing pilot outcomes publicly compounds learning across clients to accelerate industry-wide progress.
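
As a rough worked example of the first three metrics, the sketch below computes percentage reductions and an appeal overturn rate. The function names and all input numbers are made up for illustration; substitute your own pre-launch and post-launch measurements.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage drop from a pre-launch baseline to a post-launch value."""
    if before <= 0:
        raise ValueError("baseline must be positive")
    return (before - after) * 100 / before


def overturn_rate(overturned: int, appealed: int) -> float:
    """Share of appealed removals that were reversed, as a percentage."""
    if appealed == 0:
        return 0.0
    return overturned * 100 / appealed


if __name__ == "__main__":
    # Illustrative pilot numbers (assumptions, not real results).
    violation_drop = percent_reduction(before=4.0, after=2.5)   # % of posts violating policy
    mod_time_drop = percent_reduction(before=120, after=60)     # moderator hours per week
    appeals = overturn_rate(overturned=12, appealed=100)

    print(f"Violation rate reduction: {violation_drop:.1f}% (target 25-50%+)")
    print(f"Moderation time reduction: {mod_time_drop:.1f}% (target 40-60%+)")
    print(f"Appeal overturn rate: {appeals:.1f}% (target <15%)")
```

Computing each figure from the same baseline period keeps the before-and-after comparison fair; seasonal swings in posting volume can otherwise distort the reductions.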

The Future with Janitor AI

Janitor AI represents the ethical application of advanced AI for positive social impact. As natural language processing and contextual reasoning capabilities rapidly advance, the future roadmap for Janitor AI includes:

  1. Expanded Language Support – Already operational in English, Janitor AI must progress to handle moderation across all major world languages to achieve necessary global scale.
  2. Image/Video Moderation – Harmful user generated content spreads beyond text including disturbing imagery. Janitor AI’s frameworks will grow to cover visual content analysis too.
  3. Inclusive Data Practices – Representation gaps in available training data skew performance reminding us that data isn’t neutral. Anthropic will lead in fair, transparent data management.
  4. Bespoke Configurations – Pre-built, one-size-fits-all rule sets cannot address diverse community needs well enough long-term. Highly customizable Janitor AI implementations must emerge.
  5. Moderator Health Prioritization – Daily exposure to trauma, violence and toxicity takes a mental health toll on human reviewers. Janitor AI progress enables redirecting people into more fulfilling roles long-term while better supporting those who remain.

Technologies like Janitor AI feel dystopian upon first consideration but pragmatic analysis reveals tremendous potential for empowering human dignity, ethics and shared wellbeing if responsibly built. Our choices determine whether AI futures uplift or suppress humanity. Progress demands acknowledging risks while leading with hope, wisdom and care as Anthropic admirably has every step of the way.


Janitor AI is pioneering a thoughtful path for artificial intelligence to help make online communities healthier, safer and more welcoming for all. Configurable content analysis at tremendous scale, combined with interpretability and transparency, makes Janitor AI a unique offering poised for responsible industry disruption.

Careful change management and measurement of key success metrics allow smooth adoption while producing tangible safety, comfort and efficiency benefits from day one.

Ongoing expansions to handle more languages, content types and special use cases will solidify Janitor AI as a must-have solution allowing technology providers to finally get ahead of digital harassment and better serve users. But beyond commercial impacts, the work Anthropic is doing reflects technical innovation aligned with human values – an exemplar of ethical AI done right.


What is Janitor AI?

Janitor AI is an artificial intelligence system created by Anthropic to moderate online content and keep communities positive. It uses natural language processing to detect and remove toxic, harassing, and harmful posts while explaining why.

How does Janitor AI work?

Janitor AI is implemented via API on websites and apps. It analyzes submitted content against configured rules to find policy violations. Detected toxic content is automatically removed or flagged, and an explanation is shown to the user.

What content does Janitor AI analyze?

Janitor AI scans text-based content including forum posts, comments, messages, profiles, and more. Image and video analysis capabilities are in development. Audio moderation is not yet available.

What violations can Janitor AI detect?

Janitor AI detects hate speech, bullying, threats, identity attacks, sexually inappropriate content, self-harm risk, and general profanity/toxicity. Sensitive category analysis for protected groups is provided. Custom rules are configurable.

Can Janitor AI be customized?

Yes, clients can customize prioritization levels for different violation types, set enforcement actions based on severity, configure filters to enable/disable certain analysis features, and allow list exceptions.

How accurate is Janitor AI?

Accuracy rates depend greatly on training data quality and rule configurations. With enough community-specific data and tuning, Janitor AI typically achieves over 95% precision on primary use cases with continuous accuracy improvements over time.

Does Janitor AI have any biases?

Like all AI systems, Janitor AI can inherit unintended biases from imperfect training data not reflective of diverse populations. However, Anthropic utilizes responsible data collection and auditing practices to maximize fairness.

How does Janitor AI impact privacy?

Janitor AI only analyzes content data itself and does not access user accounts or identities. All processing is self-contained to enable privacy-preserving moderation without personal data exposure risk.

What volume of content can Janitor AI handle?

Janitor AI leverages cloud scalability to analyze millions of items per day. Actual throughput depends on configuration complexity but matches enterprise-scale content volumes across even the largest communities.

How do users appeal if wrongly flagged?

Platform owners provide internal appeal flows for users who believe their content was incorrectly flagged. If appeal review shows that Janitor AI removed content incorrectly, rule adjustments help improve accuracy.

Does Janitor AI replace human moderators?

Janitor AI augments human moderator capabilities for better scale, automation, and detection but does not fully replace the irreplaceable judgment of people. Humans still make final appeals calls, set enforcement policies, etc.

What are the costs associated with Janitor AI?

Pricing is customized based on expected content volume, configuration complexity, and supported languages. Contact Anthropic sales for pricing details. Discounts are offered for research partners and high-value data providers.

How do I get access to Janitor AI?

Janitor AI is opening access through a phased rollout. Sign up on Anthropic’s website for the waitlist to get notified when available. Pilot opportunities also exist for select partners.
