Python library for verb conjugation in French, Spanish, Catalan, Italian, Portuguese, and Romanian, enhanced by machine learning
[EN] Verbs completely conjugated: verb conjugations for French, Spanish, Portuguese, Italian, Romanian and Catalan, enhanced by machine learning
[CA] Verbs completament conjugats: conjugacions verbals per a francès, espanyol, portuguès, italià, romanès i català, millorades per l'aprenentatge automàtic
[ES] Verbos completamente conjugados: conjugaciones de verbos en francés, español, portugués, italiano, rumano y catalán, mejoradas por aprendizaje automático
[FR] Verbes complètement conjugués: conjugaisons des verbes français, espagnol, portugais, italien, roumain et catalan, à l'aide de l'apprentissage automatique
[IT] Verbi completamente coniugati: coniugazioni di verbi per francese, spagnolo, portoghese, italiano, rumeno e catalano, migliorate dall'apprendimento automatico
[PT] Verbos completamente conjugados: conjugações verbais para francês, espanhol, português, italiano, romeno e catalão, aprimoradas pelo aprendizado de máquina
[RO] Verbe complet conjugate: conjugări de verbe pentru franceză, spaniolă, portugheză, italiană, română și catalană, îmbunătățite de învățarea automată
- Quick Start
- Live Demo
- Example Output
- What's new in Verbecc 2.0
- Academic publications referencing Verbecc
- Typing - Parameter and Data Type Annotations
- Multi-Language Conjugation
- Multi-Language Conjugation using English mood and tense names via `localization` module
- Credits
| Français / French | Català / Catalan | Español / Castellano / Spanish | Português / Portuguese | Italiano / Italian | Română / Romanian |
|---|---|---|---|---|---|

Example output:

- French être (to be)
- Catalan ser (to be)
- Spanish ser (to be)
- Portuguese ser (to be)
- Italian essere (to be)
- Romanian fi (to be)
- French se lever (to lift oneself)
- French ubériser (to "uberize") (unknown verb conjugated with ML template prediction)

Features:

- Multilingual
  - Conjugate verbs in six Romance languages: French, Spanish, Portuguese, Italian, Romanian and Catalan
  - Includes Spanish voseo conjugation, with regional options in development
  - Predict conjugation of unknown verbs with 99% accuracy using machine learning techniques
  - Conjugate thousands of known verbs without machine learning, using simple string transformations based on XML conjugation templates
- Complete
  - Includes both simple and compound conjugations (i.e. with helping/auxiliary verbs)
  - Includes alternate conjugations (for regional variations, e.g. Catalan vs. Valencian)
  - Includes inflections for all genders where applicable
  - Includes inflections for miscellaneous pronouns such as the Spanish pronouns `usted` and `ustedes` and the French pronoun `on`
- Quality
  - Fully type-annotated Python library
  - Unit tests require type annotations on everything
  - Typed return data
  - Meticulously organized source tree
  - Has a plethora of unit tests to ensure correctness of verb conjugations
  - Continuous Integration with GitHub Actions CI/CD pipeline
  - CI tests Python 3.9, 3.10, 3.11, 3.12, 3.13 and 3.14
  - Dependencies: `scikit-learn`, `scipy`, `numpy`, `lxml`, `pyaml`, `jsbeautifier`, `importlib_resources`
- Trusted
  - Cited in academic publications
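
The template-based approach for known verbs mentioned in the list above can be illustrated with a minimal sketch. It assumes a Verbiste-style template name such as `aim:er`, where the part after the colon is the infinitive ending to strip, plus a hand-written table of present-tense endings. The `TEMPLATES` dictionary and the `conjugate_present` function are hypothetical and are not verbecc's actual data or API; the elision rule is also simplified.

```python
# Hypothetical template data: template name -> present indicative endings.
# This is illustration only, not verbecc's XML data.
TEMPLATES = {
    "aim:er": ["e", "es", "e", "ons", "ez", "ent"],
    "fin:ir": ["is", "is", "it", "issons", "issez", "issent"],
}

PRONOUNS = ["je", "tu", "il", "nous", "vous", "ils"]


def conjugate_present(infinitive: str, template: str) -> list:
    """Strip the template's infinitive ending and append each tense ending."""
    ending = template.split(":", 1)[1]
    if not infinitive.endswith(ending):
        raise ValueError(f"{infinitive!r} does not match template {template!r}")
    stem = infinitive[: -len(ending)]
    forms = []
    for pronoun, suffix in zip(PRONOUNS, TEMPLATES[template]):
        form = stem + suffix
        # Simplified elision: "je" becomes "j'" before a vowel.
        if pronoun == "je" and form[0] in "aeiouéèê":
            forms.append("j'" + form)
        else:
            forms.append(f"{pronoun} {form}")
    return forms


print(conjugate_present("parler", "aim:er"))
print(conjugate_present("finir", "fin:ir"))
```

In this sketch, conjugating a known verb only needs a lookup from verb to template; predicting the template for an unknown verb is the part the library delegates to machine learning.
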

Install from source:

    git clone https://github.com/bretttolbert/verbecc.git
    cd verbecc
    pip install .

| verbecc 1.x | verbecc 2.x |
|---|---|
| `lang='fr'` | `lang=Lang.fr` / `from verbecc import LangCodeISO639_1 as Lang` |
| `mood="indicatif"` | `mood=Moods.fr.Indicatif` / `from verbecc import Moods` |
| `tense="présent"` | `tense=Tenses.fr.Présent` / `from verbecc import Tenses` |
| `gender='f'` | `gender=Gender.f` / `from verbecc import Gender` |
| `person="1s"` | `person=Person.First, number=Number.Singular` / `from verbecc import Person, Number` |
| Conjugations include masculine pronouns (default) or feminine but not both | All pronouns, including both masculine and feminine third-person pronouns, are included |
| `lang_specific_options` is a parameter of the `conjugate` method | `lang_specific_options` is a parameter of the `CompleteConjugator` class constructor |
| `gender` is a parameter of the `conjugate` method | There is no `gender` parameter; instead all possible gender inflections are returned |
| `alternate_options` is a parameter of the `conjugate` method | There is no `alternate_options` parameter; instead all possible conjugations, including alternates, are returned (use `c[0]` to get the default conjugation, `c[1]` to get the first alternate, etc.) |
| Spanish conjugations include tú (default) or vos but not both | All pronouns, including both tú and vos, are included |
| Pronouns such as French on and Spanish usted/ustedes not included | French on and Spanish usted/ustedes pronouns are included |
| Array index is used to determine Person, i.e. 1s, 2s, 3s, 1p, 2p, 3p | Each `Conjugation` object in the `TenseConjugation` has `Person`, `Number` and `Gender` values (any of which may be `None` if not applicable) |
| Returned objects are primitive (`Dict`) data types | Returned wrapper objects are subclasses of `AbstractConjugation` (e.g. `CompleteConjugation`) with `get_data()` and `to_json()` methods |
| `Conjugator` returns `CompleteConjugationData` | `CompleteConjugator` returns wrapper type `CompleteConjugation`; `CompleteConjugation.get_data()` returns `CompleteConjugationData` |
| (no wrapper types) | Wrapper types hierarchy: `CompleteConjugation` > `MoodsConjugation` > `MoodConjugation` > `TenseConjugation` > `Conjugation` -> `conjugations: List[str]` |
| Primitive data types hierarchy: `Conjugation` > `MoodsConjugation` > `MoodConjugation` > `TenseConjugation` > `PersonConjugation` | Primitive data types hierarchy: `CompleteConjugationData` > `MoodsConjugationData` > `MoodConjugationData` > `TenseConjugationData` > `ConjugationData` -> `conjugations: List[str]` |
| `pred_score` was always included in the output | `pred_score` is only included in output if `predicted` is true |
| Only returned primitive Python data | Conjugation objects have both `.to_json()` and `.to_yaml()` methods |

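
The move from index-based persons to self-describing conjugation entries can be sketched as follows. This is a simplified stand-in written for illustration: `ConjugationSketch`, its field names and the `find` helper are hypothetical and do not reproduce verbecc's actual classes. Only the ideas come from the table above: each entry carries optional person, number and gender values, and index 0 of an entry is the default form with alternates after it. Treating "on" as having no gender is also an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ConjugationSketch:
    """Hypothetical stand-in for one entry of a tense conjugation."""
    conjugations: List[str]          # index 0 = default, 1+ = alternates
    person: Optional[str] = None     # "1", "2", "3", or None if not applicable
    number: Optional[str] = None     # "s", "p", or None
    gender: Optional[str] = None     # "m", "f", or None

    def __getitem__(self, index: int) -> str:
        return self.conjugations[index]


# Sample data: third person singular of French "être", present indicative.
tense = [
    ConjugationSketch(["il est"], person="3", number="s", gender="m"),
    ConjugationSketch(["elle est"], person="3", number="s", gender="f"),
    ConjugationSketch(["on est"], person="3", number="s"),
]


def find(entries, person, number, gender=None):
    """Select entries by grammatical attributes instead of by list position."""
    return [
        e for e in entries
        if e.person == person and e.number == number
        and (gender is None or e.gender == gender)
    ]


print([c[0] for c in find(tense, "3", "s")])
print([c[0] for c in find(tense, "3", "s", gender="f")])
```

Because every entry describes itself, adding pronouns such as "on" or "usted" does not shift the positions of the other entries, which an index-based scheme could not guarantee.
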

Originally, verbecc used strings for most parameters. It is now fully type-annotated, but strings are still supported for backward compatibility and ease of use. This is accomplished by using `StrEnum` for parameters and by defining a hierarchy of `typing` type definitions for the returned data objects (see `conjugation.py`).
E.g.:

    >>> from verbecc import grammar_defines, localization, Moods, Tenses, Person, Number, Gender, LangCodeISO639_1 as Lang
    >>> xmood = localization.xmood
    >>> xtense = localization.xtense
    >>> grammar_defines.SUPPORTED_LANGUAGES[Lang.fr]
    'français'
    >>> xtense(Lang.fr, Tenses.en.Present)
    <TenseFr.Présent: 'présent'>
    >>> xmood(Lang.fr, Moods.en.Subjunctive)
    <MoodFr.Subjonctif: 'subjonctif'>
    >>> Gender.f
    <Gender.f: 'f'>
    >>> Number.Singular
    <Number.Singular: 's'>
    >>> Person.First
    <Person.First: '1'>

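
How string-valued enums keep plain strings working can be shown with a small sketch. It uses the `str` plus `Enum` mixin pattern, which behaves like `StrEnum` for equality and value lookup. `MoodFrSketch` and `normalize_mood` are hypothetical names chosen for illustration and are not part of verbecc.

```python
from enum import Enum
from typing import Union


class MoodFrSketch(str, Enum):
    """Hypothetical string-valued enum holding two French mood names."""
    Indicatif = "indicatif"
    Subjonctif = "subjonctif"


def normalize_mood(mood: Union[str, MoodFrSketch]) -> MoodFrSketch:
    """Accept either a plain string or an enum member and return the member."""
    # Calling the enum with a value looks the member up; passing a member
    # returns that same member, so both input styles share one code path.
    return MoodFrSketch(mood)


print(normalize_mood("indicatif") is MoodFrSketch.Indicatif)
print(normalize_mood(MoodFrSketch.Subjonctif) is MoodFrSketch.Subjonctif)
print(MoodFrSketch.Subjonctif == "subjonctif")
```

An unknown string such as `"imperatif"` raises `ValueError` in this sketch, so a misspelled mood fails early instead of silently producing no result.
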

    >>> from verbecc import CompleteConjugator, LangCodeISO639_1 as Lang, grammar_defines, Moods, Tenses
    >>> ccgs = {lang: CompleteConjugator(lang) for lang in grammar_defines.SUPPORTED_LANGUAGES}
    >>> print([c[0] for c in ccgs[Lang.fr].conjugate('être')[Moods.fr.Indicatif][Tenses.fr.Présent]])
    ['je suis', 'tu es', 'il est', 'elle est', 'on est', 'nous sommes', 'vous êtes', 'ils sont', 'elles sont']
    >>> print([c[0] for c in ccgs[Lang.es].conjugate('ser')[Moods.es.Indicativo][Tenses.es.Presente]])
    ['yo soy', 'tú eres', 'vos sos', 'él es', 'ella es', 'usted es', 'nosotros somos', 'vosotros sois', 'ellos son', 'ellas son', 'ustedes son']
    >>> print([c[0] for c in ccgs[Lang.ca].conjugate('ser')[Moods.ca.Indicatiu][Tenses.ca.Present]])
    ['jo sóc', 'tu ets', 'ell és', 'ella és', 'nosaltres som', 'vosaltres sou', 'ells són', 'elles són']
    >>> print([c[0] for c in ccgs[Lang.pt].conjugate('ser')[Moods.pt.Indicativo][Tenses.pt.Presente]])
    ['eu sou', 'tu és', 'ele é', 'ela é', 'nós somos', 'vós sois', 'eles são', 'elas são']
    >>> print([c[0] for c in ccgs[Lang.it].conjugate('essere')[Moods.it.Indicativo][Tenses.it.Presente]])
    ['io sono', 'tu sei', 'lui è', 'lei è', 'noi siamo', 'voi siete', 'loro sono']
    >>> print([c[0] for c in ccgs[Lang.ro].conjugate('fi')[Moods.ro.Indicativ][Tenses.ro.Prezent]])
    ['eu sunt', 'tu ești', 'el e', 'ea e', 'noi suntem', 'voi sunteţi', 'ei sunt', 'ele sunt']

The example below shows that strings may still be used for mood and tense in place of the `Mood` and `Tense` (`StrEnum`) types. For instance, `indicative` is interchangeable with `Moods.en.Indicative`, and `present` is interchangeable with `Tenses.en.Present`.

    >>> from verbecc import CompleteConjugator, localization
    >>> def xconj(lang, infinitive, mood, tense):
    ...     # Translate the English mood and tense names into the target language
    ...     m = localization.xmood(lang, mood)
    ...     t = localization.xtense(lang, tense)
    ...     cc = CompleteConjugator(lang).conjugate(infinitive)
    ...     # Keep the default (first) conjugation of each entry
    ...     return [c[0] for c in cc[m][t]]
    ...
    >>> xconj('fr', 'etre', 'indicative', 'present')
    ['je suis', 'tu es', 'il est', 'elle est', 'on est', 'nous sommes', 'vous êtes', 'ils sont', 'elles sont']
    >>> xconj('es', 'ser', 'indicative', 'present')
    ['yo soy', 'tú eres', 'vos sos', 'él es', 'ella es', 'usted es', 'nosotros somos', 'vosotros sois', 'ellos son', 'ellas son', 'ustedes son']
    >>> xconj('pt', 'ser', 'indicative', 'present')
    ['eu sou', 'tu és', 'ele é', 'ela é', 'nós somos', 'vós sois', 'eles são', 'elas são']
    >>> xconj('ca', 'ser', 'indicative', 'present')
    ['jo sóc', 'tu ets', 'ell és', 'ella és', 'nosaltres som', 'vosaltres sou', 'ells són', 'elles són']
    >>> xconj('it', 'essere', 'indicative', 'present')
    ['io sono', 'tu sei', 'lui è', 'lei è', 'noi siamo', 'voi siete', 'loro sono']
    >>> xconj('ro', 'fi', 'indicative', 'present')
    ['eu sunt', 'tu ești', 'el e', 'ea e', 'noi suntem', 'voi sunteţi', 'ei sunt', 'ele sunt']

- Created with the help of scikit-learn, lxml, pytest and Python
- French verb conjugation template XML files derived from Pierre Sarrazin's C++ program Verbiste.
- Conjugation XML files (Verbiste format) for Spanish, Portuguese, Italian and Romanian, and machine-learning conjugation template prediction for unknown verbs, derived from Sekou Diao's older project mlconjug. A newer version, mlconjug3, is now available.
- Catalan verbs list imported from catverbs