How I learn new vocabulary with parallel texts


In response to my last post, someone asked how I’m currently learning more vocabulary, and my response started to grow past normal comment size, so I figured I’d make it a full post.

What I do is I try to have as many moments of recognition as I can. These are moments where some new word in the foreign language somehow becomes understandable or comprehensible. For instance, I see a Dutch word in my parallel text that I don’t know, so I look across at the matching English sentence and figure out what the mystery word means. This gives me a moment where I recognize that new word. By continually adding new learning moments such as this, my vocabulary increases.

This is part of the natural absorption process as you acquire a language. Each small moment of comprehension adds to your neural networks that are being unconsciously constructed. This is training material for your brain. Instead of trying to explicitly memorize a table or a list that needs to be consciously recalled (which is a slow access method), you’re instead building a net that gives you very fast subconscious recognition. Small moments of comprehensible input are the building blocks for these nets.

If I do this enough each day through my reading time, then I’ll get some repetitions for each of the words, which means I don’t have to use SRS. I usually try and purposely go over the same section of a book later in the day, to specifically repeat any words I saw before. If I were learning the language less intensely, I’d be adding sentences to Anki instead, so that I could get the right repetitions at the right time to solidify each word. Otherwise I might not see it again in time naturally. However, with 5 – 10 hours of exposure per day, I don’t think this is necessary.

Anki is actually quite a good supplement. Vocabulary is one of the few language features where it dramatically helps to “artificially” cram your head full of new items. More grammar rules don’t really help you speak at a normal pace (because a “rule” is something that must be explicitly recalled, and is therefore slow), but more vocabulary recognition actually does help you. Repeated exposure to new words, in the context of a sentence that you’ve already seen somewhere, builds subconscious recognition instead of just slow explicit recall.

To the commenter, Dustin, I suggest that you continue with Anki, but delete any cards that cause you too many problems. Don’t get trapped in the attitude that every single word must be added. There’s a lot of time that can be wasted on cards that are just “hard” and never seem to get easier. Also, they can build frustration, leading to you not spending as much time on your reviews as you might have. The solution is to enthusiastically delete cards that cause you problems. It’s ok, you’ll see those words eventually in some other context, and it’ll be easier then.

There are lots of ways to quickly acquire more vocabulary, and I recommend that people focus heavily on vocabulary specifically when starting a new language, because those first 500 or so words can lead to tremendous amounts of understanding, even without grammar. The method that I prefer for this, though, is just reading a parallel text. Sure, I might not understand any of the new language on one side of the page, but with the assistance of the English section, I can quickly find correspondences for the most common words. In some cases, it’s possible to reach 70% word recognition in a text within the first day!

In summary, I suggest finding ways to make new words at least slightly more comprehensible, and then just do it often. You can even learn a lot just by seeing those common words a lot…just moving your eyes over a lot of unknown words will give you a sense for which words are most common, and which other words they tend to be beside. These are important steps on your way to learning the full meaning of those words. Therefore, simple reading can be one of the best ways to learn new vocabulary, even if you’re very new at a language.


Dutch update: vocab self-test (91 hrs)


I just did another vocabulary self-test. This time I used a 704-word selection from somewhere in the middle of the 2nd Stieg Larsson book. I chose this book because I know there’s some pretty advanced vocab in it, much more than in Harry Potter.

Out of 704 words, I had good knowledge of 678 words, giving me a score of 96.3%. I also had good comprehension of the text…in fact it felt nice to read, so I might be able to make an attempt at the “airplane test” soon, which was one of my stated goals. This somewhat surprised me, since in the past few days it’s felt like I’ve been making zero progress, despite getting dozens of study hours in. The problem is just that the overall recognition percentage is only going up by a tiny amount, so it’s hard to notice without computing some statistics like this. Therefore, for future projects I think I’ll administer these self-tests more often, to keep up my motivation.
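
The score itself is nothing more than known words divided by total words. Here is a minimal Python sketch of that arithmetic, for anyone who wants to track the same statistic; the function name `recognition_score` is made up for illustration, and the numbers are the counts from this test.

```python
def recognition_score(known: int, total: int) -> float:
    """Return the percentage of words in a sample that were recognized."""
    if total <= 0 or not 0 <= known <= total:
        raise ValueError("need total > 0 and 0 <= known <= total")
    return 100.0 * known / total


# Counts from the self-test described above.
total_words = 704
known_words = 678

score = recognition_score(known_words, total_words)
print(f"{score:.1f}% recognized, {total_words - known_words} words to look up")
```

Logging this number after each self-test makes small gains visible that are hard to feel from day to day.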

Another bit of motivation was to write down all the unrecognized words and look them up afterwards. I noticed that there were several “unknown” words that I should have guessed from German, such as “onderzocht” (untersucht), “buik” (Bauch), “bestaan” (bestand), etc. This means that there’s still plenty of low-hanging fruit left to pluck, if I keep working at it.

Since I’m currently at about 400,000 words read, I’m now pretty confident that once I hit 1 million words read, I’ll be at a very satisfactory reading level. This mirrors my experience with German, where I was already at quite a decent level of comprehension by the 400,000 word mark, and quite happy with my results after 1 million words.

What’s the best textbook for learning German?


this is a response to a question on the HTLAL forums about how to get started at German, from scratch

My recommendation is to focus on vocab and listening at the start, and gradually move into more and more reading (especially with audiobooks to go with the books).

At the start, you need to do a lot of listening in order to grasp the sound system and the rhythm of the language. Learn to love the sounds of it, and try to imitate it. You also need to rapidly learn the basic vocab so that you can start to understand some real sentences. A brief glance at some grammar examples will probably also help you to piece things together, but there’s no need to memorize any tables or anything.

For vocab, it can be quite handy to use some of those little phrasebooks. I’ve looked at a lot of German phrasebooks and compared, and I think that one of the best is the Kauderwelsch “German, word by word” phrasebook. There are actually a lot of nice explanations in it, and they do a word-by-word translation of all the phrases, in addition to the regular English translation. Another one that’s good just for sheer number of words, is the Lonely Planet German phrasebook.

You can also try downloading some of the shared decks in Anki, and working through those.

Ok, so the next step (or even simultaneous step) is to move into native materials, especially books. I recommend Harry Potter, since it’s fairly easy as novels go, and there’s a great audiobook. Rufus Beck reads the German audio version, and he’s fantastic! The problem for you is that at the start, you won’t know many of the words. You can balance this out a bit by spending more time at the start doing some lookups, but I also encourage you to just listen and read, even if you don’t get it all. You’ll get a lot from the voice-acting that Beck does, and from the surrounding words that you already understand from their relatives in English.

If you sit back and try to enjoy the book as much as you can, you’ll get into it a bit more and you’ll start getting partial meanings of the words from context. From the little bits and pieces that you get, you’ll be able to get more and more of the story. Keep a highlighter pen around for the words that you see multiple times and really want to know. Just highlight them and keep reading, and then you can go back later and look them all up at once and put them into Anki or some other flashcard program.

Last year I did something like this for several months. At the start, I hardly understood any of Harry Potter, and I also didn’t get much of the TV shows I was watching. By the time I got to book three in the Harry Potter series, I actually had begun to understand quite a bit. When I got to book 5 I understood almost everything.

The thing that’s nice about the audiobook is that it’ll keep pushing you through the text. Instead of going super slowly and getting stuck on every word, you’re pushed to try to make sense of the general story, and you get much more exposure to the language. You can go back and look up some of the words, but your desire to find out what happens in the story will keep you going back to the audiobook to find out.

Now, keep in mind that this is all passive. When I first got to Germany, I could read a real novel and understand almost everything, but I still spoke mostly like a beginner in terms of my expressive ability. At some point, you’re going to have to decide to start trying to speak, and there are differing preferences on when to do this. Some people prefer to start right away, but since you’re not coming to the country for a while then it should be fine if you decide to wait until you have high comprehension (because then you’ll have the handy ability to tell which things “just sound right” to you).

Above all, the most important thing is to find stuff that’s interesting to you. It doesn’t matter if everyone in the world rates a certain textbook as “super awesome” if you find it boring, because then you won’t continue with it. For most people, “interesting” usually equates with actual real native material such as books and movies, so then your task is to shoot through as much basic vocab as you can so that you can jump into native materials sooner. And don’t be afraid to use the native materials as your guide of which words to learn. You can learn the words as you come across them.

wordlists and core vocab


from this thread on HTLAL

In French something like 25 verbs make up over 50% of the verb forms in ordinary spoken speech. My statistics may be off, but the point is clear. Instead of trying to learn 500 French verbs, master the 25 first and then progressively work your way through the others as they come up. For these very reasons I believe that with a vocabulary of 1000 words well learned one could get by very well in French and probably fool a lot of people.

The problem with these percentages is that even if you know those 25 words, and they come up in every sentence, you still won’t understand those sentences as they are spoken to you. Also, once you add in some more specific (but less frequent) words that help you in a couple of everyday situations, then the number starts to shoot upwards. Sticking to a low limit like 1000 is difficult.
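
To see what a figure like “25 verbs make up over 50%” is actually measuring, here is a rough Python sketch that computes how much of a text is covered by its N most frequent words. The function name `coverage_of_top_words` and the sample sentence are invented for illustration, and the tokenizer is deliberately crude (it just pulls out runs of letters).

```python
import re
from collections import Counter


def coverage_of_top_words(text: str, n: int) -> float:
    """Fraction of all running words accounted for by the n most frequent words."""
    # Crude tokenizer: lowercase the text and keep runs of letters only.
    words = re.findall(r"[^\W\d_]+", text.lower())
    if not words:
        return 0.0
    counts = Counter(words)
    covered = sum(count for _, count in counts.most_common(n))
    return covered / len(words)


# Invented sample text, just to show the calculation.
sample = "the cat sat on the mat and the dog sat on the cat"
for n in (1, 4):
    print(f"top {n} words cover {coverage_of_top_words(sample, n):.0%} of the text")
```

Note that this counts word tokens, not sentences: a high coverage number can still leave an unknown word in almost every sentence, which is why the percentage overstates how much you will understand.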

In principle, though, I mostly agree. There is really a core of the language that you need to master and have it always ready. If you can fluidly produce the basic things from that core, then it becomes an easy task to learn another 20 – 100 new words in a short time period in order to deal with a new potential situation.

I think it’s possible to go the other way around, though. Taking what Iverson said earlier about learning many many more words right at the start, I’m starting to imagine that one should actually do this backwards. Instead of learning the core really well and then expanding your vocab later, you could learn tons of vocab as fast as you can and then use your extensive vocabulary superpowers to read and listen to tons of native material that would help you cement the core parts.

I think this relates well to the idea of having a good balance of “intensive” and “extensive” reading, but I’ll have to think more about just concentrating on massive vocab, which is a slightly different path than intensive reading (which is more well-rounded, not focusing entirely on vocab).

This relates to my current Swedish project quite well, because I have a wonderful frequency-based wordlist of 2000 common words that each have an example sentence. I keep thinking that I’m not using this list to its full potential, since I’ve only made flashcards for the “A” up to the “E” words so far. It’s just much easier to stay interested if I’m reading a real book instead of playing with a wordlist. It does look like my ability to read would be greatly increased if I spent more time on the list first, though. Maybe I just need more hours in the day 😉

Overall, the importance should rest on finding something fun, but if you can manage short bursts of interest in something like a wordlist, then perhaps it would be worth it if it then enhanced your enjoyment of the really fun stuff. Don’t overdo it though, or else it’ll start to seem too much like a chore instead of your super-fun hobby!

vocab vs. grammar


There was a lively debate on HTLAL this past week about whether grammar or vocabulary is more fun to learn. Actually, it was hardly a debate, since it seemed to be more of an expression of personal preferences and a listing of enjoyable ways that different people studied. One thing that got me thinking, however, was the way the discussion only covered learning individual vocabulary words or learning discrete formulaic grammar rules. To me, language is much more subtle than that and there are many connections and layers to it.

I agreed with one of the commenters, who said that most kids and adults can speak well without understanding the “underlying” grammar rules, but I think there’s a problem here. I don’t believe that the grammar rules are “underlying”. I’d actually say the opposite, that the rules are closer to “overlaying”. Grammar rules are an artificial construction and are not necessary for learning the language. They are incomplete, underspecified, and mostly just an attempt at description, but we as language learners tend to put a lot of importance into them sometimes.

The rules for how a language works are usually more complex than the pieces we recognize as “grammar”. It’s an interesting task to try to describe a language with a set of rules, but the rules become too cumbersome if we try and include all of the features. They end up just being a list of exceptions. Rules just “feel” better when they are parsimonious. If we can make them as simple as possible without being useless, then they feel more mathematically pure and satisfying to us. Parsimony is a wonderful principle to strive for in descriptions, and grammar rules can be useful in many ways, but we can’t get distracted by believing that they *are* the language. There’s more to it. “The map is not the territory.”

In real language, even if you’ve mastered these rules and try to produce some sentences in accordance with them, you’ll find that only a subset of the results are actually “correct”. There’s another layer at work, with acquired experience determining which of the grammatically correct sentences are actually still valid in the language. One example I can think of right now is that in English we have the phrase “lethal injection”, but we can’t say “deadly injection” or “mortal injection”. In one sense, they are sort of correct and everyone will understand them, but to an experienced speaker of the language they just aren’t acceptable.

So, although grammar is sometimes interesting to me, and vocabulary is more interesting, this other mysterious level of the language is what I’m most interested in investigating. It’s like memorizing hundreds of digits of Pi…there are many little interconnected intricacies that defy generalized patterns, but you can make up your own little patterns to help you as you go along. Those little made-up patterns you find are not Pi, they are your own creations. They don’t reflect the entire number, or define it. But they can be very helpful to you in their own way.

In computational linguistics, when trying to get a computer program to understand language at some level, there’s been a big push to develop statistical strategies rather than relying on pre-formed grammar rules. The grammar rules always have too many exceptions and are hard to manually program in, but using statistical methods you can “feed” your program more examples and have it get closer to the real language. I actually think this matches more closely how humans learn too. Instead of specifically studying grammar and vocab as individual items, I enjoy it more if I feed my brain on multiple levels simultaneously by trying to understand examples however I can.
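
As a toy illustration of the statistical idea (not how any real system works), the Python sketch below counts which word pairs actually occur in a handful of example sentences. The sentences and the function name `bigram_counts` are made up; the point is only that a pairing like “lethal injection” shows up in the examples, while “deadly injection” simply never appears, with no rule needed to forbid it.

```python
from collections import Counter


def bigram_counts(sentences):
    """Count how often each pair of adjacent words occurs in the sentences."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        # Pair each word with the word that follows it.
        counts.update(zip(words, words[1:]))
    return counts


# Invented example sentences standing in for "real language" input.
examples = [
    "he was sentenced to death by lethal injection",
    "the state uses lethal injection",
    "a deadly snake bit him",
]

counts = bigram_counts(examples)
print(counts[("lethal", "injection")])  # pair seen in the examples
print(counts[("deadly", "injection")])  # pair never seen, so the count is 0
```

Feeding in more examples changes the counts without anyone writing a new rule, which is the appeal of the approach.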

I’m also really interested in “everyday” and casual speech. It’s so hard to learn from books, because it’s hard to make those simple rules to describe it. You have to learn it from examples in real life and absorb it, since a lot of it is more like verbal customs that are acceptable rather than mathematical rules. Those customs can often change quickly, and are different in different locations, so they wouldn’t necessarily help sell a book well. At this level, the language is hard to make rules about, and is hard to commodify. You just have to jump in and experience it.

october progress spreadsheet



October was an interesting month. At the start I had a small slump where I was just doing a small amount each day to keep up, but I managed to turn that around and increase again. I changed my balance slightly and did less TV, but more reading. This was partially due to getting a bit bored of Star Trek after over 100 episodes, so I have to change things around there if I want to keep watching more TV.

I learned that in this intermediate stage of my learning, it’s quite helpful to spend a week doing some hard vocab work (in which I added lots of example sentences to Anki from my “Mastering German Vocabulary” book). This extensive vocabulary work allowed me to push through to a stage where I understand most of the words on the page quite easily, and there are only a few words that I don’t know.

For the next month, I hope to continue increasing my vocabulary in some specific areas like science, economics, and politics. I also plan to start doing some basic speaking practice on my own, and I’m hoping to develop a better ability to think in German.

don’t just learn a giant list


Here’s a comment I recently made in a forum. The thread was about whether or not it would be beneficial to try and memorize the 4000 most frequently used words in a language as a strategy for learning it.

I think what Parasitius was saying is that if you hope to learn a language through pure flashcards of important vocabulary, you will bore yourself to death, but if you combine it with reading enjoyable native materials, then it can be extremely helpful.

This has been my experience for sure. At different times I vary the percentages, but I like Parasitius’ estimates of 20% SRS and 80% reading. Also, Iverson has given some good advice on this too, saying that his wordlists are for giving him just a general sense of a word’s meaning, but it’s really reading that gives him all the multiple meanings and real usages of the words. Flashcards or wordlists will never teach you all the subtleties of usage.

Also, I recommend avoiding the idea that you can “scientifically” learn vocabulary “in order”, focusing on “completeness”. Although those things appeal to me, having a background in math and computers, I feel that this mindset is a bit of a dead end for language learning. Instead, I tell myself that I will need to experience each word multiple times in its “natural environment” before I’ll really understand it, and my flashcard work is merely “prep time” that will get me ready for the real thing.

In my mind there are several stages of “knowing” a word. At first, I might see a word a few times in books and I sort of recognize it in the sense of “hey, I’ve seen that before somewhere”. Next, I might look it up once, and get a general sense of the meaning, but I tend to forget it again soon unless I add it to Anki (my SRS of choice). As I keep reading my novels and seeing these new words several times, the word evolves from “huh?” to “oh ya, I recognize that”, to “I know the translation for that” to “I know the meaning without translating” and then to “I can use it with ease in speech”.

I’ve found that the key to moving along this path is just repeated exposure. If you’re really worried at the start that you need to collect 4000 common words and become an expert at all of them, I think you’re going in the wrong direction. Just consistently investigate words as you encounter them, and your vocabulary will grow over time. Curiosity and diligence, that’s all.

When reading, you don’t need to highlight EVERY word on the page that you don’t know. Just pick the two that are most interesting. You’ll see the other ones again eventually; you won’t “lose” them or anything, they’ll still be around later in another book or magazine or movie. As long as you’re somehow improving every day, then that’s enough.