Content analysis is a research tool used to determine the presence of certain words, topics, or concepts within qualitative data (i.e., text). Through content analysis, researchers can quantify and analyze the presence, meanings, and relationships of those words, topics, or concepts. For example, researchers may evaluate the language used in a newspaper article to look for bias or partiality. Researchers can then make inferences about the messages of the texts, the writer or writers, the audience, and even the culture and time surrounding the text.


Data can come from interviews, open-ended questions, field research notes, conversations, or any occurrence of communicative language (such as books, essays, debates, newspaper headlines, speeches, media, and historical documents). According to Krippendorff (1980), a single study can analyze various forms of text. To analyze text through content analysis, the text must be coded, or broken down, into manageable categories for analysis (i.e., “codes”). Once the text is coded, the codes can in turn be grouped into broader “code categories” to further summarize the data.

Below are two different definitions of content analysis.

Definition 1: “Any technique for making inferences by systematically and objectively identifying special characteristics of messages.”

Definition 2: “An interpretive and naturalistic approach. It is observational and narrative in nature and is based less on the experimental elements normally associated with scientific research (reliability, validity and generalizability).”

Uses of Content Analysis

Determine the psychological or emotional state of individuals or groups

Reveal patterns in communication content

Test and improve an intervention or survey before it is launched

Analyze focus group interviews and open-ended questions to complement quantitative data.

Types of content analysis

There are two general types of content analysis: conceptual analysis and relational analysis. Conceptual analysis determines the existence and frequency of concepts in a text. Relational analysis extends conceptual analysis by examining the relationships between the concepts in a text. Each type of analysis can lead to different results, conclusions, interpretations, and meanings.

Conceptual analysis

People usually think of conceptual analysis when they think of content analysis. In conceptual analysis, a concept is chosen for examination, and the analysis consists of quantifying and counting its presence. The main objective is to examine the occurrence of the selected terms in the data. Terms can be explicit or implicit. Explicit terms are easy to identify. Coding implicit terms is more complicated: the researcher must decide the level of implication and make judgments that rest on subjectivity (a concern for reliability and validity). Coding implicit terms therefore involves the use of a dictionary, contextual translation rules, or both.

According to Berelson (1952), to begin a conceptual content analysis, one must first identify the research question and choose one or more samples for analysis. Next, you need to encode the text into manageable content categories. It is basically a process of selective reduction. By narrowing the text down to categories, the researcher can focus on specific words or patterns that inform the research question and encode them.

General steps to perform a conceptual content analysis

Decide the level of analysis

Word, word sense, phrase, sentence, topics

Decide how many concepts to code

Develop a predefined or interactive set of categories or concepts. Decide whether to: A. allow flexibility to add categories throughout the coding process, or B. stick to the predefined set of categories.

Option A allows for the introduction and analysis of new and important material that could have significant implications for your research question.

Option B allows the researcher to stay focused and examine the data for specific concepts

Decide whether to encode the existence or frequency of a concept. The decision changes the coding process

When encoded by the existence of a concept, the researcher would count a concept only once if it appeared at least once in the data and no matter how many times it appeared.

By encoding the frequency of a concept, the researcher would count the number of times a concept appears in a text.

Decide how you will distinguish the concepts

Should words be coded exactly as they appear, or coded as the same when they appear in different forms? For example, “dangerous” versus “dangerousness.” The point is to create coding rules so that these word segments are classified transparently and logically. The rules can make all of these word segments fall into the same category, or they can be formulated so that the researcher distinguishes these word segments into separate codes.

What level of implication is allowed? Words that imply the concept, or only words that explicitly state the concept? For example, “dangerous” versus “the person is scary” versus “that person could cause me harm.” These word segments may not merit separate categories, because each implicitly means “dangerous.”

Develop rules for coding your texts

Once the decisions in the first four steps have been made, the researcher can begin to develop rules for translating the text into codes. This keeps the coding process organized and consistent, so that the researcher codes exactly what they intend to code. The validity of the coding process is ensured when the researcher is consistent and coherent in their codes, meaning that they follow their own translation rules. In content analysis, obeying the translation rules is equivalent to validity.

Decide what to do with irrelevant information

Should it be ignored (e.g., common English words like “the” and “and”), or used to re-examine the coding scheme in case it adds something to the encoding result?

Encode the text

This can be done by hand or with software. With software, researchers can enter the categories and have the program perform the coding automatically, quickly, and efficiently. When coding is done by hand, the researcher can recognize errors more easily (e.g., typos or misspellings). If computer coding is used, the text should first be cleaned of errors so that all available data are included. The choice between hand coding and computer coding matters most for implicit information, where careful preparation of categories is essential for accurate coding.

Analyze your results

Draw conclusions and generalizations where possible. Determine what to do with irrelevant, unwanted, or unused text: reexamine it, ignore it, or reassess the coding scheme. Interpret the results carefully, as conceptual content analysis can only quantify the information. Typically, general trends and patterns can be identified.
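The coding decisions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `CODING_RULES` dictionary, the code names, and the sample sentence are hypothetical, and matching is done on whole words or phrases after lowercasing. Words that no rule mentions (such as “the” or “and”) are simply ignored, which corresponds to one way of handling irrelevant information.

```python
import re
from collections import Counter

# Hypothetical coding rules: each code lists the explicit and implicit
# word forms or phrases that count as an instance of that concept.
CODING_RULES = {
    "danger": ["dangerous", "dangerousness", "scary", "cause me harm"],
    "trust": ["trust", "trusted", "reliable"],
    "anger": ["angry", "furious"],
}

def code_text(text, rules):
    """Frequency coding: count every occurrence of each code in the text."""
    # Normalize to lowercase words separated by single spaces.
    normalized = " ".join(re.findall(r"[a-z']+", text.lower()))
    counts = Counter()
    for code, phrases in rules.items():
        for phrase in phrases:
            pattern = r"\b" + re.escape(phrase) + r"\b"
            counts[code] += len(re.findall(pattern, normalized))
    return counts

def to_existence(counts):
    """Existence coding: 1 if the concept appears at least once, else 0."""
    return {code: int(n > 0) for code, n in counts.items()}

sample = ("The person is scary. That dangerous person could cause me harm, "
          "but my neighbor is reliable.")
frequency = code_text(sample, CODING_RULES)
existence = to_existence(frequency)

print(dict(frequency))  # {'danger': 3, 'trust': 1, 'anger': 0}
print(existence)        # {'danger': 1, 'trust': 1, 'anger': 0}
```

Here the implicit phrasings “scary” and “cause me harm” are folded into the same code as “dangerous”; a researcher who wanted to keep them apart would give each its own entry in the rules.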

Relational analysis

Relational analysis begins like conceptual analysis, with a concept chosen for examination. However, the analysis goes on to explore the relationships between concepts. Individual concepts are viewed as having no inherent meaning; rather, meaning is a product of the relationships between concepts.

According to Fielding and Lee (1991), to begin a relational content analysis, you must first identify a research question and choose one or more samples for analysis. The research question should be focused so that the types of concepts are not open to interpretation and can be summarized. Then select the text for analysis. Select it carefully, balancing two risks: too little information limits the results, while too much makes the coding process too arduous and heavy to yield meaningful and valuable results.

There are three subcategories of relational analysis that you can choose from before moving on to the general steps.

Affect extraction

An emotional evaluation of the concepts explicit in a text. A challenge of this method is that emotions can vary across time, populations, and space. However, it can be effective in capturing the emotional and psychological state of the speaker or writer of the text.

Proximity analysis

An evaluation of the co-occurrence of explicit concepts in the text. The text is defined as a string of words, called a “window,” that is scanned for the co-occurrence of concepts. The result is a “concept matrix,” that is, a group of interrelated, co-occurring concepts that suggests an overall meaning.

Cognitive mapping

A visualization technique for either affect extraction or proximity analysis. Cognitive mapping attempts to create a model of the overall meaning of the text, such as a graphical map that represents the relationships between concepts.
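As a rough sketch of proximity analysis, the following Python slides a fixed-size window over a tokenized text and counts how often each pair of concepts falls inside the same window. The function name `cooccurrence_matrix`, the window size of five words, and the sample sentence are assumptions made for illustration; real studies choose the window and the concept list to suit the research question.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_matrix(tokens, concepts, window_size):
    """Count, for each pair of concepts, the windows in which both appear."""
    pairs = Counter()
    last_start = max(1, len(tokens) - window_size + 1)
    for start in range(last_start):
        window = set(tokens[start:start + window_size])
        present = sorted(c for c in concepts if c in window)
        # Each unordered pair is stored once, in alphabetical order.
        for a, b in combinations(present, 2):
            pairs[(a, b)] += 1
    return pairs

tokens = "the flood caused fear and the fear caused anger in the town".split()
matrix = cooccurrence_matrix(tokens, {"flood", "fear", "anger"}, window_size=5)

print(matrix[("fear", "flood")])   # 2
print(matrix[("anger", "fear")])   # 3
print(matrix[("anger", "flood")])  # 0
```

The resulting pair counts are one possible form of the concept matrix: “fear” co-occurs with both “flood” and “anger,” while “flood” and “anger” never share a window. Such counts could also serve as edge weights when drawing a cognitive map.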

General steps to perform a relational content analysis

Determine the type of analysis: Once the sample is selected, the researcher must determine what types of relationships he will examine and the level of analysis: word, word sense, phrase, sentence, topics.

Reduce the text to categories and encode the words or patterns. The researcher can encode the existence of meanings or words.

Explore the relationship between concepts: once the words are coded, the text can be analyzed for the following:

Strength of the relationship: degree of relationship between two or more concepts.

Sign of the relationship: the concepts are positively or negatively related to each other.

Direction of the relationship: the type of relationship the categories exhibit. For example, “X implies Y,” “X occurs before Y,” “if X then Y,” or “X is the main motivator of Y.”

Coding relationships

One difference between conceptual and relational analysis is that, in relational analysis, the statements or relationships between the concepts are themselves coded.

Perform statistical analysis: explore the differences or look for relationships between the variables identified during coding.

Develop representations: such as decision maps and mental models.
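One possible way to put a number on the strength and sign of a relationship is to correlate the frequency codes of two concepts across a set of texts. The sketch below uses the Pearson correlation for this purpose; that choice, the `pearson` helper, and the counts are assumptions for illustration rather than a procedure prescribed by the sources cited here. Note that a correlation says nothing about direction (for example, whether X occurs before Y), which still has to be coded from the text itself.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers.

    Assumes both lists vary; a constant list would give a zero denominator.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical frequency codes for two concepts across five texts.
fear_counts = [0, 1, 2, 3, 4]
trust_counts = [4, 3, 2, 1, 0]

r = pearson(fear_counts, trust_counts)
strength = abs(r)  # how strongly the two concepts are related
sign = "positive" if r > 0 else "negative" if r < 0 else "none"

print(strength, sign)  # 1.0 negative
```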

Reliability and validity


Due to the human nature of researchers, coding errors can never be eliminated, but only minimized. Usually, 80% is an acceptable margin of reliability. The reliability of a content analysis is based on three criteria:


Stability: the tendency of coders to consistently recode the same data in the same way over a period of time.


Reproducibility: the tendency of a group of coders to classify category membership in the same way.


Accuracy: the degree to which the classification of the text corresponds statistically to a standard or norm.
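Agreement between coders, the second criterion above, can be checked with simple arithmetic. The sketch below computes the share of text segments on which two coders assigned the same code, which can be compared against the 80% figure mentioned earlier. It also computes Cohen's kappa, a widely used chance-corrected statistic that is not discussed in the text above and is included only as an assumption about what a researcher might additionally report. The two code lists are hypothetical.

```python
from collections import Counter

def percent_agreement(codes_a, codes_b):
    """Share of segments to which both coders assigned the same code."""
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)

def cohens_kappa(codes_a, codes_b):
    """Agreement corrected for chance; undefined if expected agreement is 1."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned by two coders to the same ten segments.
coder_a = ["danger", "danger", "trust", "trust", "danger",
           "trust", "danger", "trust", "trust", "danger"]
coder_b = ["danger", "danger", "trust", "danger", "danger",
           "trust", "danger", "trust", "trust", "trust"]

agreement = percent_agreement(coder_a, coder_b)
kappa = cohens_kappa(coder_a, coder_b)

print(agreement)        # 0.8
print(round(kappa, 2))  # 0.6
```

The same `percent_agreement` function could be applied to one coder's first and second pass over the same data to check consistency over time.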


The validity of a content analysis is based on three criteria:

Closeness of categories: this can be achieved by using multiple classifiers to arrive at an agreed definition of each specific category. Using multiple classifiers, a concept category that is an explicit variable can be broadened to include synonyms or implicit variables.


Conclusions: What level of implication is permissible? Do the conclusions correctly follow from the data? Are the results explainable by other phenomena? This is especially problematic when computer programs are used for analysis and must distinguish between different senses of a word. For example, the word “mine” variously denotes a personal pronoun, an explosive device, and a deep hole in the ground from which ore is extracted. Computer programs can get an accurate count of the occurrence and frequency of that word, but they cannot account for the meaning inherent in each particular use. This problem can distort the results and invalidate any conclusions.

Generalizability of the results to a theory: this depends on clear definitions of the concept categories, on how they are determined, and on how reliably they measure the idea one is seeking to measure. Generalizability parallels reliability, as it depends largely on all three reliability criteria.
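The “mine” problem is easy to reproduce. In this small sketch, with three made-up sentences and a hypothetical `count_word` helper, a plain word count returns the same tally regardless of which sense is meant:

```python
import re

# Three hypothetical sentences, each using "mine" in a different sense.
sentences = [
    "That book is mine.",
    "The soldier stepped on a mine.",
    "They dug ore out of the mine.",
]

def count_word(word, texts):
    """Count whole-word occurrences of a word across texts, ignoring case."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return sum(len(re.findall(pattern, t.lower())) for t in texts)

mine_count = count_word("mine", sentences)
print(mine_count)  # 3, although the three uses share no meaning
```

A count of three is accurate as a frequency, yet it would be misleading as evidence about any single concept, which is why the coding rules need to say how such words are to be handled.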

Benefits of content analysis

Directly examines communication through text

Enables both qualitative and quantitative analysis

Provides valuable historical and cultural insight over time

Allows closeness to the data

The coded form of the text can be statistically analyzed

Unobtrusive means of analyzing interactions

When done right, it is considered a relatively “accurate” research method.

Content analysis is an easily understandable and inexpensive research method.

It is a more powerful tool when combined with other research methods, such as interviews, observation, and the use of archival records. It is very useful for analyzing historical material, especially for documenting trends over time.

Disadvantages of content analysis

May be time-consuming

It is subject to increased error, especially when relational analysis is used to reach a higher level of interpretation.

It often lacks a theoretical basis, or attempts too liberally to draw meaningful inferences about the relationships and impacts implied in a study.

It is inherently reductive, especially when dealing with complex texts.

It often amounts to little more than word counts.

It often disregards the context in which the text was produced, as well as the state of affairs after the text was produced.

It can be difficult to automate or computerize.


Bibliographic References

Berelson, Bernard. Content Analysis in Communication Research. New York: Free Press, 1952.

Krippendorff, Klaus. Content Analysis: An Introduction to its Methodology. Beverly Hills: Sage Publications, 1980.

Fielding, NG & Lee, RM. Using Computers in Qualitative Research. SAGE Publications, 1991. (Refer to Chapter by Seidel, J. ‘Method and Madness in the Application of Computer Technology to Qualitative Data Analysis’.)
