Testing the classifier across 30 languages
181 test cases across 4 difficulty tiers
Tier 1: Clear language markers and distinctive syntax
Tier 2: Few distinctive features; the snippet could belong to several languages
Tier 3: Misleading comments, embedded code, cross-language styles
Tier 4: Polyglot code valid in multiple languages, ambiguous snippets
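
As a rough illustration of how a tiered suite like this might be scored, the sketch below groups labeled snippets by tier and reports accuracy per tier. Everything in it is an assumption made for illustration: the `LabeledSnippet` type, the `accuracy_by_tier` helper, the 1 to 4 tier numbering, and the toy classifier are hypothetical and are not taken from the actual test harness.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass(frozen=True)
class LabeledSnippet:
    """One hypothetical test case: a code snippet, its true language, its tier."""
    snippet: str
    expected: str
    tier: int  # assumed numbering: 1 = easiest, 4 = hardest


def accuracy_by_tier(
    cases: Iterable[LabeledSnippet],
    classify: Callable[[str], str],
) -> Dict[int, float]:
    """Return the fraction of correctly classified snippets for each tier."""
    totals: Dict[int, int] = defaultdict(int)
    correct: Dict[int, int] = defaultdict(int)
    for case in cases:
        totals[case.tier] += 1
        if classify(case.snippet) == case.expected:
            correct[case.tier] += 1
    return {tier: correct[tier] / totals[tier] for tier in sorted(totals)}


def toy_classifier(snippet: str) -> str:
    """Stand-in classifier using two keyword checks (illustrative only)."""
    if "def " in snippet:
        return "python"
    if "#include" in snippet:
        return "c"
    return "unknown"


cases = [
    LabeledSnippet("def f():\n    return 1", "python", 1),
    LabeledSnippet("#include <stdio.h>", "c", 1),
    LabeledSnippet("x = 1", "ruby", 2),  # few distinctive features
    LabeledSnippet("# int main() {}\ndef main(): pass", "python", 3),  # misleading comment
]

report = accuracy_by_tier(cases, toy_classifier)
for tier, acc in report.items():
    print(f"tier {tier}: {acc:.0%}")
```

Reporting accuracy per tier rather than as a single number keeps a strong result on the easy cases from hiding failures on the ambiguous or polyglot ones.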