 
   Speech and Language Resource Bank
    
    
	
        
	        
		 A corpus of European Parliamentary speech and tools for machine learning models. 
                Authors: Changhan Wang, Morgane Riviere, Ann Lee, Anne Wu, Chaitanya Talnikar, Daniel Haziza, Mary Williamson, Juan Pino, Emmanuel Dupoux
 
                Updated:  2021-04-30 
            
		
Source:  https://aclanthology.org/2021.acl-long.80/
                Keywords: English, German, French, Spanish, Polish, Italian, Romanian, Hungarian, Czech, Dutch, Finnish, Slovak, Slovenian, Estonian, Lithuanian, Portuguese, Bulgarian, Greek, Latvian, Maltese, Swedish, Danish, speech synthesis, machine learning, Accented Speech
         
        
	
        
	        
		 Python 3.7 resources to evaluate bigram and trigram frequencies in corpora. 
                Authors: Jeroen van Paridon, Bill Thompson
 
                Updated:  2019-04-30 
            
		
Source:  https://github.com/jvparidon/subs2vec
                Keywords: language, bigram, trigram, lexical norms, psycholinguistics, Afrikaans, Arabic, Bulgarian, Bengali, Breton, Bosnian, Catalan, Czech, Danish, German, Greek, English, Esperanto, Spanish, Estonian, Basque, Farsi, Finnish, French, Galician, Hebrew, Hindi, Croatian, Hungarian, Armenian, Indonesian, Icelandic, Italian, Georgian, Kazakh, Korean, Lithuanian, Latvian, Macedonian, Malayalam, Malay, Dutch, Norwegian, Polish, Portuguese, Romanian, Russian, Sinhala, Slovak, Slovenian, Albanian, Serbian, Swedish, Tamil, Telugu, Tagalog, Turkish, Ukrainian, Urdu, Vietnamese