Assessing Your Language Level

Challenges of Accurate Assessment

In sports, athletes compete against each other to determine relative skill levels. Each sport has its own ranking system, and rankings are updated based on an individual’s or team’s performance. Ranking systems use sophisticated algorithms that may account for variables such as home advantage, strength of schedule, tournament difficulty, and diminishing returns for noncompetitive games, among others. With modern computing power and complex ranking systems, predictions of winners are becoming more accurate. But the beauty of sports lies in the element of randomness and unknown, incalculable variables. Systems can compute the odds of winning, but they cannot establish certainty. According to Michael Mauboussin, luck plays a factor. In Think Twice, he explains that the more players (and therefore interactions) and the more games per season, the more luck impacts the rankings. That’s why sports are exciting. That’s why people will continue rooting for the underdog. 
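The rating update at the core of such systems can be illustrated with the classic Elo formula. This is a minimal sketch only: real ranking systems layer on the refinements mentioned above (home advantage, strength of schedule, and so on), and the ratings and k-factor here are arbitrary example values.

```python
def elo_expected(rating_a, rating_b):
    # Expected score of player A against player B under the Elo model:
    # a 400-point rating gap corresponds to 10-to-1 odds.
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))

def elo_update(rating_a, rating_b, score_a, k=32):
    # score_a: 1 for a win, 0.5 for a draw, 0 for a loss.
    # k controls how volatile ratings are; 32 is a common choice.
    expected_a = elo_expected(rating_a, rating_b)
    return rating_a + k * (score_a - expected_a)

# Two evenly matched players: the winner gains half of k.
print(elo_update(1500, 1500, 1))  # 1516.0
```

Note how the update is self-correcting: beating a much stronger opponent (low expected score) yields a large gain, while beating a much weaker one yields almost nothing.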

Unlike sports, language isn’t based on competition. Language assessment is highly biased and subjective because language is dynamic (accent, word choice, speech patterns, personality, identity), social (conversation partner, group discussions, power dynamics), and contextual (casual, professional, academic). 


I have taught students who were incredible communicators in professional settings but were unable to have casual conversations. I have had students who crushed standardized tests but lacked conversational fluency, and vice versa: students who scored poorly on tests but spoke with incredible fluency. The bottom line is that language skill comes in all shapes and sizes. A single aggregate score is practical for businesses, schools, and individuals, but it also simplifies reality. 

There is no single assessment that accurately captures a language learner’s skill level. The best approach is to examine multiple sources: subjective and objective, internal and external. Each assessment has its pros and cons. Below are some of the problems with each method:  

      • Standardized Language Tests: Whether students take the TOEIC or TOEFL, each test examines language in a specific context. The TOEIC examines business English, focusing primarily on reading, grammar, and listening comprehension, whereas the TOEFL tests academic English and assesses all four skills. These tests measure not only language ability but also test-taking and academic ability. Other problems include inauthentic materials, technical issues, administrative errors, improper seating, and disruptions. Lastly, standardized tests don’t take into account the impact of geography and economics; for example, people from different cultures have different comfort levels with computerized tests and audiovisual materials. 
      • Language School Assessment: Schools administer level assessment tests to decide class allocation. Instead of reinventing the wheel, they use online level tests, which have varying levels of credibility and don’t follow any specific standard. Leveling-up decisions are not objective either; teachers make those assessments based on experience, intuition, skill assessment tests, self-interest, favoritism, school politics, and progress reports. Furthermore, these assessments are relative: students are compared to other students in the same school, not to an objective standard. In the end, language schools are businesses, with all sorts of incentives to control how often and how fast students level up.
      • Expert Teacher Assessment: A teacher’s assessment of their students is subjective, shaped by cognitive biases, memory biases, social norms (e.g., sugarcoating, indirect negative feedback), self-interest, and intuition. In Moneyball, Michael Lewis recounts the paradigm shift that occurred in baseball in the early 2000s. Instead of scouting players based on expert intuition alone, teams began supplementing their decision-making with an analytics-driven approach to find talent, ultimately revolutionizing the scouting process. Similarly, teachers who rely solely on intuition will misjudge their students’ levels. 
      • Certification: Certifications handed out by language schools or online courses lack credibility. There’s no way to compare language levels between these entities because each agency has its own criteria for assessment. Language schools also suffer from internal politics, and they award certifications not based on performance, but mostly based on who spends the most money. 
      • Years and Hours of Study: Neither is an accurate indicator of skill. One hour of input doesn’t produce the same amount of improvement for every learner. Factors that influence progress include motivation, study habits, learning strategies, stage of learning, quality of practice, feedback, language aptitude, environment, cultural knowledge, and a whole host of other variables, not to mention the impact of geography and economics. Students also rarely track the number of hours they invest in their studies. 
      • Native Speaker Assessment: Native speakers run the gamut from highly eloquent to painfully incompetent. Just because someone is a native speaker doesn’t mean that he or she has high language ability, or that he or she can assess someone’s language level. It isn’t rare for students to have a wider working vocabulary than many native speakers; they are native-like, with only a subtle accent disclosing their nonnative origin. Besides, people have a social incentive to remain polite and not cause offense, resulting in vague, sugarcoated analysis: “Your English is great!”
      • Self-assessment: Self-evaluation makes more sense for advanced students because beginner and intermediate students lack the skill to recognize their own incompetence. Students who compare themselves to their peers out of ease and availability must understand that the comparison is relative. Instead, they should measure themselves against an absolute standard such as the American Council on the Teaching of Foreign Languages (ACTFL) or the Common European Framework of Reference for Languages (CEFR). Students can go to the official ACTFL website to see examples of how learners perform at different levels in different subskills. 
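The idea of combining these imperfect tools into a fuller picture can be sketched as a weighted average of scores from several sources. Every source name, score, and weight below is hypothetical, invented purely for illustration; in practice the weights would reflect how much trust each method deserves for a given learner and context.

```python
def aggregate(scores, weights):
    """Weighted average of normalized (0-100) scores from multiple sources."""
    total_weight = sum(weights[src] for src in scores)
    return sum(scores[src] * weights[src] for src in scores) / total_weight

# Hypothetical learner: strong with a teacher, weaker on self-assessment.
scores = {"standardized_test": 72, "teacher": 80, "self_assessment": 65}
weights = {"standardized_test": 0.5, "teacher": 0.3, "self_assessment": 0.2}

print(round(aggregate(scores, weights), 1))  # 73.0
```

Even this toy version makes the essay’s point visible: the aggregate hides the spread between sources, so the per-source scores (72, 80, 65) carry information that the single number 73.0 does not.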

Until a new system is introduced to measure language level, companies and schools will continue to emphasize language tests as the most accurate determiner of skill because practicality trumps accuracy. Standardization creates a single aggregated metric, enabling schools to make easy comparisons and allowing students to set clear goals and measure their progress. But do standardized tests accurately assess language ability? The answer is obvious at this point. Determining language proficiency is challenging. A single method isn’t exhaustive enough, but using all the available tools can help create a more accurate picture of language level—including the limitations and nuances of one’s ability. Perhaps advanced algorithms and AI can solve this problem in the future.