Introduction
The "Extract Themes" and Machine Learning features in codeit work natively with over 100 languages.
This page lists the languages supported and the recommendations to follow when working with non-English languages or multilingual data.
Languages Supported
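The table below shows the languages supported by the codeit AI, with one column for each of the two features. A "No" in a column means that feature does not support the language.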
Where a language is supported, the codeit AI can process data natively in this language without any further intervention required by the user.
Any language that is not listed is not supported by either the "Extract Themes" or the "Machine Learning" feature.
Where a language is not supported, the codeit AI cannot work natively with data in that language, so the data must first be translated into a supported language. Data can be auto-translated from one language to another using codeit.
See here for instructions to auto-translate your data.
Language | ISO Code | Extract Themes | Machine Learning |
---|---|---|---|
Afrikaans | af | No | Yes |
Albanian | sq | Yes | Yes |
Amharic | am | Yes | No |
Arabic | ar | Yes | Yes |
Armenian | hy | Yes | Yes |
Azerbaijani | az | No | Yes |
Bashkir | ba | No | Yes |
Basque | eu | No | Yes |
Belarusian | be | No | Yes |
Bengali | bn | Yes | Yes |
Bosnian | bs | Yes | Yes |
Breton | br | No | Yes |
Bulgarian | bg | Yes | Yes |
Catalan | ca | Yes | Yes |
Cebuano | ceb | No | Yes |
Chinese (Literary) | lzh | Yes | Yes |
Chinese Simplified | zh-Hans | Yes | Yes |
Chinese Traditional | zh-Hant | Yes | Yes |
Chuvash | cv | No | Yes |
Croatian | hr | Yes | Yes |
Czech | cs | Yes | Yes |
Danish | da | Yes | Yes |
Dutch | nl | Yes | Yes |
English | en | Yes | Yes |
Estonian | et | Yes | Yes |
Filipino (Tagalog) | fil or tl | Yes | Yes |
Finnish | fi | Yes | Yes |
French | fr | Yes | Yes |
French (Canadian) | fr-CA | Yes | Yes |
French (France) | fr-FR | Yes | Yes |
Georgian | ka | Yes | Yes |
German | de | Yes | Yes |
Greek | el | Yes | Yes |
Gujarati | gu | Yes | Yes |
Haitian Creole | ht | No | Yes |
Hindi | hi | Yes | Yes |
Hungarian | hu | Yes | Yes |
Icelandic | is | Yes | Yes |
Indonesian | id | Yes | Yes |
Italian | it | Yes | Yes |
Japanese | ja | Yes | Yes |
Kannada | kn | Yes | Yes |
Kazakh | kk | Yes | Yes |
Korean | ko | Yes | Yes |
Latvian | lv | Yes | Yes |
Lithuanian | lt | Yes | Yes |
Macedonian | mk | Yes | Yes |
Malay | ms | Yes | Yes |
Malayalam | ml | Yes | Yes |
Marathi | mr | Yes | Yes |
Mongolian | mn | Yes | No |
Myanmar (Burmese) | my | Yes | Yes |
Norwegian | no | Yes | Yes |
Persian | fa | Yes | Yes |
Polish | pl | Yes | Yes |
Portuguese | pt | Yes | Yes |
Punjabi | pa | Yes | Yes |
Romanian | ro | Yes | Yes |
Russian | ru | Yes | Yes |
Serbian | sr | Yes | Yes |
Sicilian | scn | No | Yes |
Slovak | sk | Yes | No |
Slovenian | sl | Yes | Yes |
Somali | so | Yes | Yes |
Spanish | es | Yes | Yes |
Sundanese | su | No | Yes |
Swahili | sw | Yes | Yes |
Swedish | sv | Yes | Yes |
Tamil | ta | Yes | Yes |
Tatar | tt | No | Yes |
Telugu | te | Yes | Yes |
Thai | th | Yes | Yes |
Turkish | tr | Yes | Yes |
Ukrainian | uk | Yes | Yes |
Urdu | ur | Yes | Yes |
Uzbek | uz | No | Yes |
Vietnamese | vi | Yes | Yes |
Welsh | cy | No | Yes |
Yoruba | yo | No | Yes |
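The table can also be read as a simple lookup: given an ISO code and a feature, decide whether the data can be processed natively or needs translating first. The sketch below is illustrative only and is not part of codeit. The `SUPPORT` dictionary, the feature names, and the `needs_translation` function are hypothetical, and only a handful of rows from the table are copied in.

```python
# Hypothetical excerpt of the table above, not a codeit API.
# ISO code -> (Extract Themes supported, Machine Learning supported)
SUPPORT = {
    "af": (False, True),   # Afrikaans
    "am": (True, False),   # Amharic
    "cy": (False, True),   # Welsh
    "en": (True, True),    # English
    "mn": (True, False),   # Mongolian
    "sk": (True, False),   # Slovak
}

# Position of each feature in the tuples above (names are assumptions).
FEATURES = {"extract_themes": 0, "machine_learning": 1}


def needs_translation(iso_code: str, feature: str) -> bool:
    """Return True if data in this language should be translated first.

    A language missing from the table is treated as unsupported
    by both features.
    """
    supported = SUPPORT.get(iso_code, (False, False))
    return not supported[FEATURES[feature]]


print(needs_translation("af", "extract_themes"))    # Afrikaans: No for Extract Themes
print(needs_translation("af", "machine_learning"))  # Afrikaans: Yes for Machine Learning
```

A per-feature check matters because some languages, such as Afrikaans or Slovak, are supported by only one of the two features.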
Using AI with multilingual data
Sometimes a coding Task can consist of verbatims in a mix of different languages.
For these types of projects, we recommend the following steps when using the codeit AI:
- It is better to code the data in the original languages if possible. This avoids problems with auto-translations. The AI results in most languages are comparable to the results in English.
- If the coders need to translate the data, please make sure the language is correctly flagged in the data using the Language datafield, as the translations and AI results are more accurate when the language is properly identified.
- If no language data is available, auto-detect can be used, but the AI results will not be as accurate.
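The recommendations above amount to a small decision for each verbatim, which can be sketched as follows. This is a minimal illustration under stated assumptions, not how codeit works internally: the `plan_verbatim` function, its return labels, and the `supported` set passed in by the caller are all hypothetical.

```python
from typing import Optional, Set


def plan_verbatim(language: Optional[str], supported: Set[str]) -> str:
    """Suggest how to handle one verbatim, given its language flag.

    `language` is the ISO code from the Language datafield, or None
    when no language data is available. `supported` is the set of ISO
    codes the chosen feature supports (taken from the table).
    """
    if not language:
        # No language flag: fall back to auto-detect (less accurate).
        return "auto-detect"
    if language in supported:
        # Preferred route: code the data in its original language.
        return "code natively"
    # Unsupported language: translate into a supported one first.
    return "translate first"


supported_codes = {"en", "fr", "de"}  # illustrative subset only
for flag in ("fr", "cy", None):
    print(flag, "->", plan_verbatim(flag, supported_codes))
```

Passing the supported codes in as a parameter keeps the sketch independent of which feature is being used, since the two features support different languages.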