  • 00:06

    Hi and welcome to the first session of the "Building a Parser from scratch" class! My

  • 00:10

    name is Dmitry Soshnikov and I'll be teaching this  class and just to set the agenda - this is purely  

  • 00:16

    practical class, right, we will have just a small amount of theory in this class related to parsing,

  • 00:20

    but most of the videos will be actually practical  exercises, building a parser from scratch, and  

  • 00:25

    we'll be building a parser for a full programming  language similar in syntax to Java or JavaScript.  

  • 00:31

    And for those of you who are interested  in deeper theory for parsing and parsing  

  • 00:36

    algorithms you may consider also the "Essentials  of Parsing" class where we discuss in detail all  

  • 00:42

    the parsing algorithms such as LL(1), LR(1)  parsing, etc, and we also built a parser in  

  • 00:48

    that class using a parser generator tool. And in the current class - in the "Building a Parser from

  • 00:53

    scratch" - we will be building a manual parser, Recursive-descent, and we will see the whole

  • 00:58

    implementation, building a Tokenizer, Parsing  modules construction, AST nodes, etc. Right,  

  • 01:04

    since sometimes you don't need to go deeper into the theory and just need to build an actual parser, an

  • 01:10

    actual practical thing, so in this case this class  is exactly for you. And for those who have taken  

  • 01:15

    the "Essentials of Interpretation" class where we built an interpreter for a programming language,

  • 01:21

    this parser will be a frontend, since in that  class we used S-expression format for actual  

  • 01:28

    interpretation, for actual runtime semantics. And the S-expression, being actually the abstract

  • 01:35

    syntax tree format, fits interpreters best; however, when it comes to ergonomics, users of your

  • 01:42

    language usually prefer writing something like what you see on the right hand side. So in the

  • 01:50

    "Building a Parser from scratch" we will take the  syntax which is on the right hand side and we'll  

  • 01:56

    parse it into AST, into Abstract Syntax Tree. And  with this being said let's start from the parsing  

  • 02:03

    pipeline, we will slightly talk about theory and  already today we'll jump into implementation.  

  • 02:09

    So let's say we have the abstract program  which prints "hello", and the first module  

  • 02:13

    which meets our program is known as Tokenizer.  Sometimes it's called Lexer or Scanner, and it  

  • 02:20

    performs the lexical analysis. And the purpose of the tokenizer is to group individual characters

  • 02:26

    into a recognizable stream of tokens, right, as you can see here we scan two tokens:

  • 02:31

    identifier with the value "print" and the string  with the value "hello". Again the purpose is just  

  • 02:36

    to group the characters into higher-level entities since it's more convenient to work with

  • 02:42

    these tokens versus the individual characters. And as you can see a token has a type and a value.
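
To make the grouping step concrete, here is a minimal sketch of a tokenizer for the `print "hello"` example. The course defines its own tokenizer in the next lecture, so everything here (the `Spec` table, the `tokenize` function, the token type names) is a hypothetical illustration, not the course's implementation.

```javascript
// Hypothetical token specification: each entry pairs a regular
// expression with the token type it produces. A null type means "skip".
const Spec = [
  [/^\s+/, null],                   // whitespace is skipped
  [/^[A-Za-z_]\w*/, 'IDENTIFIER'],  // e.g. print
  [/^"[^"]*"/, 'STRING'],           // e.g. "hello" (quotes are kept in the value)
];

// Groups the characters of `source` into a flat list of tokens.
function tokenize(source) {
  const tokens = [];
  let cursor = 0;
  while (cursor < source.length) {
    const rest = source.slice(cursor);
    let matched = false;
    for (const [regexp, type] of Spec) {
      const match = regexp.exec(rest);
      if (match === null) continue;
      cursor += match[0].length;
      if (type !== null) tokens.push({type, value: match[0]});
      matched = true;
      break;
    }
    if (!matched) {
      throw new SyntaxError(`Unexpected character: "${rest[0]}"`);
    }
  }
  return tokens;
}

// Two tokens: an IDENTIFIER with value print, and a STRING with value "hello".
console.log(tokenize('print "hello"'));
```

Each token is a plain object with a `type` and a `value`, which is the shape described above.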

  • 02:48

    And we also should say that the tokenizer doesn't make any judgment about whether our program

  • 02:54

    is syntactically valid, right in this case as we  can see we have two tokens: the keyword "if" and  

  • 03:00

    the open parenthesis "(" and for us, the readers  of this code, this program is definitely invalid,  

  • 03:06

    but the tokenizer will normally scan this  program and will extract two tokens. Now  

  • 03:12

    the module which actually validates whether  our program is syntactically valid is known  

  • 03:16

    as the Parser, which performs syntactic analysis. And in theory the purpose of the parser is exactly the validation,

  • 03:24

    but in practice the parser produces the next intermediate representation known as Abstract

  • 03:28

    Syntax Tree or the AST for short. And in this  case as you can see we have the Call expression  

  • 03:35

    with the function name "print" and arguments,  with the single argument, the word "hello".  

  • 03:42

    So let's see one more example of the AST. So we  have the expression "x = 10 * 5 + y". And the  

  • 03:51

    abstract syntax trees usually have the operators  or the function names as the interior or the root  

  • 03:58

    nodes - in this case we have the assignment  operator "=" which has left hand side and the  

  • 04:03

    right hand side, and as you can see the right hand  side here is a complex expression itself - it's  

  • 04:09

    the "+" operator which also has the left hand side  and the right hand side, again the left hand side  

  • 04:15

    is the complex expression "*", and the right hand  side is the identifier "y". Okay so the purpose of  

  • 04:24

    the parser again is to translate this string into  this tree, and this tree is the form which can  

  • 04:31

    be further passed either to the code generator or to an interpreter to obtain the actual result.
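
As a sketch of what "passing the tree to an interpreter" can mean, the snippet below writes the tree for `x = 10 * 5 + y` by hand and walks it. The node type `AssignmentExpression`, the property names `left`, `right` and `name`, and the `evaluate` function are assumptions made for illustration; the value of `y` is made up as well.

```javascript
// Hand-written AST for `x = 10 * 5 + y` (node and property names are assumed).
const ast = {
  type: 'AssignmentExpression',
  operator: '=',
  left: {type: 'Identifier', name: 'x'},
  right: {
    type: 'BinaryExpression',
    operator: '+',
    left: {
      type: 'BinaryExpression',
      operator: '*',
      left: {type: 'NumericLiteral', value: 10},
      right: {type: 'NumericLiteral', value: 5},
    },
    right: {type: 'Identifier', name: 'y'},
  },
};

// Minimal tree-walking interpreter; `env` maps variable names to values.
function evaluate(node, env) {
  switch (node.type) {
    case 'NumericLiteral':
      return node.value;
    case 'Identifier':
      if (!(node.name in env)) {
        throw new ReferenceError(`${node.name} is not defined`);
      }
      return env[node.name];
    case 'BinaryExpression': {
      const left = evaluate(node.left, env);
      const right = evaluate(node.right, env);
      if (node.operator === '+') return left + right;
      if (node.operator === '*') return left * right;
      throw new SyntaxError(`Unknown operator: ${node.operator}`);
    }
    case 'AssignmentExpression':
      env[node.left.name] = evaluate(node.right, env);
      return env[node.left.name];
    default:
      throw new SyntaxError(`Unknown node type: ${node.type}`);
  }
}

const env = {y: 2}; // illustrative value for y
evaluate(ast, env);
console.log(env.x); // 52
```

The interpreter never looks at the original string; the nesting of the tree alone decides that `10 * 5` is computed before `y` is added.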

  • 04:37

    And if we look again at the syntax, sometimes people who are not familiar with the parsing process

  • 04:43

    start thinking: "...how do you actually parse the  string into the tree?.." And the first approach  

  • 04:50

    people try is just using Regular Expressions. But in fact regular expressions alone don't work for parsing

  • 04:55

    programming languages, since they cannot describe arbitrarily nested constructs. And in the parsing process we do use regular expressions but those are used

  • 05:01

    only for the tokenizer module, right if we take  these two modules - tokenizer and the parser - the  

  • 05:07

    tokenizer, that is the scanner, uses exactly the  regular expressions, and we'll be using them a lot  

  • 05:13

    in the tokenizer module. As you can see here we  define the regular expression for numbers \d+  

  • 05:21

    which means "a digit repeated one or more times",  and we give a name for this regular expression,  

  • 05:27

    right, we name this token "number". And the parser itself is defined using something which

  • 05:33

    is called Backus-Naur form or the BNF for short. In this case we have the production "E", that is

  • 05:41

    "expression", which derives only one number, right, this is a very simple grammar, a very simple program,

  • 05:49

    which accepts only numbers. And this program can  accept any number since the regular expression  

  • 05:54

    which is used in tokenizer accepts any digits. And  the Backus-Naur form is exactly the notation to  

  • 06:01

    describe syntaxes or grammars of languages. Let's  take a look at one more example. So here we have  

  • 06:07

    the main entry point which is called "Program"  and we say the Program consists of StatementList.  

  • 06:14

    The StatementList is the list of statements,  and each statement might be anything, for  

  • 06:18

    example here we have BlockStatement, IfStatement,  FunctionDeclaration, etc. And we're recursively  

  • 06:26

    defining each production and describe what it means. For example, the FunctionDeclaration here

  • 06:32

    is defined as the "def" keyword, as you can see here in italics we define the actual tokens,

  • 06:39

    right not the production rules but actual  tokens which can appear in our programs "as is".  

  • 06:44

    So the function declaration is defined as "def",  followed by the function name which is Identifier,  

  • 06:50

    followed by the Arguments which are in the  parenthesis, and followed by the function  

  • 06:55

    body which is nothing but the BlockStatement. Okay  so that's the BNF grammar which we'll be following  

  • 07:02

    when building our manual recursive descent parser. And when we think about

  • 07:07

    the tree format, a simple format, which is also the most practical today, is just JSON notation,

  • 07:14

    right each AST node is represented as the object  which has the "type" property, for example for the  

  • 07:20

    expression "7 + 3 * 4" we see the BinaryExpression  with the operator "+", okay which has left hand  

  • 07:30

    side, which is number 7, NumericLiteral, and the  right hand side which is a complex expression  

  • 07:36

    itself, also the BinaryExpression, but it  has different operator, the "*" operator.  

  • 07:43

    And as you can see here we have the correct  precedence of operations: here the "*" operator  

  • 07:48

    has the higher precedence than the "+" operator, and this is also something the parser should enforce.
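
The tree for `7 + 3 * 4` can be written out as a plain JavaScript object, as sketched below. The `type`, `operator` and `value` properties and the node names `BinaryExpression` and `NumericLiteral` follow the description above; the property names `left` and `right` and the helper `toPrefix` are assumptions added for illustration.

```javascript
// AST for `7 + 3 * 4`: the "*" node is nested inside the right-hand side
// of the "+" node, which is how the tree encodes operator precedence.
const ast = {
  type: 'BinaryExpression',
  operator: '+',
  left: {type: 'NumericLiteral', value: 7},
  right: {
    type: 'BinaryExpression',
    operator: '*',
    left: {type: 'NumericLiteral', value: 3},
    right: {type: 'NumericLiteral', value: 4},
  },
};

// Hypothetical helper: renders the tree in prefix form to make the nesting visible.
function toPrefix(node) {
  if (node.type === 'NumericLiteral') return String(node.value);
  return `(${node.operator} ${toPrefix(node.left)} ${toPrefix(node.right)})`;
}

console.log(toPrefix(ast)); // (+ 7 (* 3 4))
```

If the parser got precedence wrong, the "+" node would be nested inside the "*" node instead, and the printed form would be `(* (+ 7 3) 4)`.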

  • 07:55

    And when we talk about parsers in general  we differentiate the hand-written parsers  

  • 08:00

    and automatically generated ones. And this class is devoted specifically to the hand-written parsers

  • 08:06

    where the main implementation and the most  practical today (and the most powerful)  

  • 08:11

    is known as Recursive-descent parser. So we'll  be building exactly this parser in this class,  

  • 08:17

    but also the parsing process can be automated.  And here we have all these cryptic names:  

  • 08:22

    LL, LR, GLR, PEG, etc, and you can take  a look at the parser generator tool  

  • 08:30

    called Syntax which is a language-agnostic parser generator.

  • 08:35

    Basically what it does is, you can take a  grammar and just create an automated parser.  

  • 08:40

    But when we have a complex language and want to understand the parsing process in full detail,

  • 08:46

    the manual recursive descent parser is pretty much the standard today for most production-level

  • 08:52

    languages, right since you have the full  control of the parser and you may define  

  • 08:56

    better error handling, better error messages, and  build more sophisticated parsers which are not  

  • 09:02

    possible with automated parsers. Again for those  of you who are interested in automated parsers  

  • 09:07

    and in parsing algorithms, you can refer to the class called "Essentials of Parsing" where we

  • 09:13

    discuss in detail all these parsing algorithms  and build a parser using a parser generator.  

  • 09:19

    So let's jump right away into practice, and  with this being said, please meet Letter!  

  • 09:25

    Letter is a syntax for a programming language with a functional heart and

  • 09:29

    object-oriented support. And this syntax will be  parsed into AST, right we already agreed to use  

  • 09:36

    JSON structure for AST with the "type" property,  but you may also parse into S-expression format  

  • 09:42

    which is also nothing but the AST, and we  will show how to transform different types  

  • 09:48

    of ASTs between each other. So once again the  Letter syntax on the left and on the right we  

  • 09:54

    see the parsed AST for the syntax. Okay with this  being said, let me switch to the implementation.  

  • 10:02

    Okay, so I'm going to create the directory for our  project called "letter-rdp", and RDP here means  

  • 10:09

    the Recursive-descent parser. And let's make the  source Parser.js - we'll be using Node.js and  

  • 10:17

    JavaScript to build this parser, please look up how to install Node.js, and JavaScript being

  • 10:23

    the most practical language today should  be accessible for many engineers. And let's  

  • 10:28

    start building our parser. Now the main API  of the parser is exactly the "parse" method  

  • 10:33

    which should accept a string and return an AST.  Okay so let's do this - we accept the string,  

  • 10:40

    store it as our property. This parser is known as a recursive descent parser, which

  • 10:47

    means that it starts from the main entry point, the starting symbol, and goes recursively down

  • 10:53

    until it parses the full tree. So let's  name the main entry point as the Program,  

  • 10:59

    okay. And for each production in the  grammar we're going to have the handler  

  • 11:03

    function with the same name, so for the Program  production we have the Program function name.  

  • 11:09

    And today we're going to support only numbers,  let's name the numbers as NumericLiteral. In  

  • 11:15

    the documentation for each handler of the  production we will write the actual grammar,  

  • 11:21

    right what this function or what this grammar  rule derives. In this case we say Program  

  • 11:27

    derives NumericLiteral. In other words we say, our program may consist only of one number. And that's

  • 11:35

    exactly what we return - the Program is just  one number, that is just one NumericLiteral.  

  • 11:40

    And since we introduced the new function,  NumericLiteral, we need to define it. And what  

  • 11:45

    is NumericLiteral? Well, it's just a number. This NUMBER, written in upper-case letters,

  • 11:51

    is coming from the tokenizer. We will define the  tokenizer in the next lecture and today we will  

  • 11:56

    consider the whole string as just containing one number. So what should the NumericLiteral return?

  • 12:03

    As we said, any production should return an  AST node, and we agreed that our AST nodes will  

  • 12:09

    have the "type" which is the "NumericLiteral",  and some properties associated with this type.  

  • 12:15

    In this case the NumericLiteral has only one  property, let's call it "value", and the value  

  • 12:20

    should contain the numeric representation  of the number from the parsing string,  

  • 12:27

    since the string consists of characters, that is the string characters, we need to convert the

  • 12:32

    string into a number, so we use the whole string  as the program and just convert it to a number.  
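
Put together, the parser described so far might look like the sketch below. The `parse`, `Program` and `NumericLiteral` names, the grammar comments, the `type` and `value` properties and the `Number(...)` conversion come from the lecture; the private field name `_string` is an assumption.

```javascript
class Parser {
  /**
   * Parses a string into an AST.
   */
  parse(string) {
    this._string = string; // assumed property name
    // Start from the main entry point and descend recursively.
    return this.Program();
  }

  /**
   * Main entry point.
   *
   * Program
   *   : NumericLiteral
   *   ;
   */
  Program() {
    return this.NumericLiteral();
  }

  /**
   * NumericLiteral
   *   : NUMBER
   *   ;
   */
  NumericLiteral() {
    // No tokenizer yet: the whole input string is treated as one number.
    return {
      type: 'NumericLiteral',
      value: Number(this._string),
    };
  }
}

console.log(new Parser().parse('42'));
```

Each grammar production has a handler method with the same name, so extending the grammar later means adding or changing methods rather than restructuring the class.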

  • 12:38

    At this point we don't use any tokenizer and just  support only a single number. And this should be  

  • 12:45

    it. Okay, let's write the small test runner  - I'm creating the __tests__ directory and  

  • 12:52

    the main test runner, and for now let's just evaluate a program directly - I'm creating

  • 12:59

    the parser instance from our Parser class. As  we said the first program will be very simple,  

  • 13:06

    it will support only one number, let's say 42. And just as in any parser API we call

  • 13:11

    the "parse" method passing the string  program, and we should obtain the AST.  

  • 13:18

    Okay, let's execute, and yes - we have the first  AST node from the parser, and it's NumericLiteral.  
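
A test runner along these lines could be as small as the following sketch. The file name is not given in the lecture, so something like `__tests__/run.js` is a hypothetical choice, and the inline `Parser` class is only a condensed stand-in so the snippet runs on its own; in the project it would be loaded from `Parser.js` instead.

```javascript
// Stand-in for the Parser class from Parser.js (condensed for this snippet).
class Parser {
  parse(string) {
    return {type: 'NumericLiteral', value: Number(string)};
  }
}

const parser = new Parser();

// The first program is a single number.
const program = `42`;

const ast = parser.parse(program);

// Pretty-print the resulting AST node.
console.log(JSON.stringify(ast, null, 2));
```

In the printed JSON the value appears as `42` without quotes, which shows it is stored as a number rather than as a string.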

  • 13:24

    As you can see the number 42 is correctly represented here as a number, not as a string,

  • 13:29

    because we use the Number(...) conversion.  And this parser should support any number,  

  • 13:35

    let's try something else, and yes - as you can  see it's supported. Okay so that's it for today,  

  • 13:41

    this is the introduction lecture and the agenda of what we'll be building. So in the next lecture

  • 13:46

    we'll continue working with our parser and we'll  introduce the tokenizer and we'll be extending  

  • 13:50

    the language, adding more support for other  constructs. Okay, thanks and see you in the class.
