AI Learns to Write Rap Lyrics!

  • 00:00

    [computer-generated gibberish]

  • 00:03

    Yeah, alright, so hi everybody, it's me Cary /khhh/

  • 00:06

    Now, I've always thought of myself as a musical person

  • 00:10

    [loud singing, recorder screeching, and rubber chicken shrieking]

  • 00:16

    Isn't it amazing?

  • 00:17

    [sigh] No. No, Cary, that isn't amazing.

  • 00:20

    Anyway, given that I've used AI to compose Baroque music

  • 00:24

    [Computery's Baroque music]

  • 00:29

    And I've used AI to compose jazz music

  • 00:32

    [Computery's jazz music]

  • 00:37

    I think it just makes sense for me to fast-forward the musical clock another 60 years to compose some rap music

  • 00:44

    But before I do that,

  • 00:45

    I gotta give credit to Siraj Raval, who actually did this first.

  • 00:49

    homie grows on E like Leone totin inspired enough

  • 00:52

    But you know what they say: No rap battle is complete without two contenders

  • 00:56

    So what did I do to build my own digital rap god?

  • 01:00

    Well, I used Andrej Karpathy's recurrent neural network code again

  • 01:04

    An RNN is just an ordinary neural network

  • 01:07

    But we give it a way to communicate with its future self through this hidden state, meaning it can store memory.
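
As a rough illustration of that hidden state, here is a generic character-RNN step in the spirit of Karpathy's min-char-rnn (sizes and names are made up for the sketch, not taken from the video's code):

```python
import numpy as np

vocab_size, hidden_size = 65, 100  # hypothetical sizes
Wxh = np.random.randn(hidden_size, vocab_size) * 0.01   # input -> hidden
Whh = np.random.randn(hidden_size, hidden_size) * 0.01  # hidden -> hidden: the "memory" path
Why = np.random.randn(vocab_size, hidden_size) * 0.01   # hidden -> output
bh, by = np.zeros(hidden_size), np.zeros(vocab_size)

def step(x, h):
    """One time step: read a one-hot character x, update the hidden state h."""
    h = np.tanh(Wxh @ x + Whh @ h + bh)  # the network's message to its future self
    y = Why @ h + by                     # unnormalized scores for the next character
    p = np.exp(y) / np.sum(np.exp(y))    # softmax -> next-character probabilities
    return p, h
```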

  • 01:13

    Now I've done this countless times before so I won't dive too deep into what an RNN is

  • 01:18

    Instead I want to focus more on a twist I implemented that makes this quote/unquote "algorithm" more musical.

  • 01:24

    Before I do that though

  • 01:26

    I need to introduce you to Dave from boyinaband.

  • 01:28

    He's, um, a tad bit good at rapping, I guess

  • 01:32

    [definitely more than "a tad bit good" rapping]

  • 01:39

    So when I first trained Karpathy's RNN to generate rap lyrics in 2017

  • 01:43

    I invited him over to read the lyrics my algorithm had written

  • 01:47

    but then I lost the footage and then he lost the footage and

  • 01:51

    Well, long story short, there's no footage of it ever happening. That left me bummed for a bit

  • 01:55

    But then I realized this could be interpreted as a sign from above

  • 01:59

    Perhaps the AI prevented us humans from rapping its song because it wanted to do the rap itself!

  • 02:05

    Well, Computery, if you insist.

  • 02:08

    To give Computery a voice,

  • 02:09

    I downloaded this Python module that lets us use Google's text-to-speech software directly
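
The module isn't named in the video, but gTTS is one Python package that exposes Google's text-to-speech this way; a minimal sketch, not necessarily what was actually used:

```python
from gtts import gTTS

line = "No rap battle is complete without two contenders"  # any generated lyric line
gTTS(line, lang="en").save("line.mp3")  # fetches the spoken audio as an MP3 file
```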

  • 02:16

    I'm pretty sure you've heard this text-to-speech voice before.

  • 02:20

    Now, as we hear Computery's awesome rap

  • 02:22

    I'm gonna show the lyrics on screen. If you're up for it, you viewers out there can sing along too!

  • 02:28

    Alright, let's drop this track

  • 02:38

    Wait, why aren't you singing along?

  • 02:41

    WHY AREN'T YOU-

  • 02:41

    The reason it performed so badly is that it hasn't had any training data to learn from.

  • 02:46

    So let's go find some training data. With my brother's help,

  • 02:49

    I used a large portion of the Original Hip-Hop Lyrics Archive as my data set to train my algorithm on.

  • 02:55

    This includes works by rap giants like Kendrick Lamar and Eminem

  • 02:59

    We stitched around 6,000 songs into one giant text file

  • 03:03

    (separated with line breaks) to create our final data set of 17 million text characters
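
The stitching step itself is simple; here is a minimal sketch, with a hypothetical folder layout of one lyric file per song:

```python
from pathlib import Path

songs = sorted(Path("ohhla_lyrics").glob("*.txt"))  # assumed: ~6,000 song files
with open("rap_dataset.txt", "w", encoding="utf-8") as out:
    for song in songs:
        # blank line between songs, matching the "separated with line breaks" format
        out.write(song.read_text(encoding="utf-8").strip() + "\n\n")
```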

  • 03:09

    Wait, at one byte per character, that's only 17 megabytes. A single 4-minute video typically takes up more space than that.

  • 03:14

    Yeah, it turns out that text as a data type is incredibly dense.

  • 03:18

    You can store a lot of letters in the same amount of space as a short video. Let's see what the algorithm learned.

  • 03:23

    Okay, ready? Go-stop

  • 03:25

    As you can see, after just 200 milliseconds, less than a blink of an eye,

  • 03:29

    It learned to stop putting spaces everywhere.

  • 03:32

    In the data set, you'll rarely see more than two spaces in a row

  • 03:35

    So it makes sense that the AI would learn to avoid doing that too

  • 03:39

    However, I can see it still putting in uncommon patterns like double I's and capital letters in the middle of words

  • 03:45

    So let's keep training to see if it learns to fix that

  • 03:48

    We're half a second into training now, and the pesky double I's seem to have vanished

  • 03:52

    The AI has also drastically shortened the length of its lines.

  • 03:56

    But behind the scenes, that's actually caused by an increase in the frequency of the line break character.

  • 04:00

    For the AI, the line break is just like any other text character

  • 04:05

    However, to match the data set

  • 04:07

    we need a good combination of both line breaks and spaces

  • 04:11

    Which we actually get in the next iteration!

  • 04:13

    And here we see the AI's first well-formatted word: "it"

  • 04:18

    Wait, does "eco" count as a word? Not sure about that.

  • 04:21

    Oh my god, you guys. Future Cary here

  • 04:23

    I realize that's not an uppercase I; it's a lowercase L. Major 2011 vibes.

  • 04:26

    Now at one full second into training,

  • 04:28

    We see the AI has learned that commas are often not followed by letters directly

  • 04:33

    There should be a space or a line break afterwards.

  • 04:36

    By the way, the average human reads at 250 words per minute

  • 04:39

    So a human learning how to rap alongside the AI has currently read...

  • 04:44

    Four words.

  • 04:46

    I'm gonna let it run in the background as I talk about other stuff

  • 04:48

    So one thing I keep getting asked is "what is loss?"

  • 04:52

    Basically, when a neural network makes a guess about what the next letter is gonna be,

  • 04:56

    it assigns a probability to each letter type

  • 04:59

    And loss just measures how far away those probabilities were, on average, from the true answer given by the data set

  • 05:05

    So lower loss usually means the model can predict true rap lyrics better
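
In other words, it's the average cross-entropy of the predicted next-character distributions; a sketch of the measurement being described:

```python
import numpy as np

def average_loss(predictions, targets):
    """predictions[t] is the model's probability distribution at step t;
    targets[t] is the index of the character that actually came next."""
    return -np.mean([np.log(predictions[t][c]) for t, c in enumerate(targets)])

# A confident correct guess costs little; a confident wrong one costs a lot:
print(average_loss([np.array([0.9, 0.05, 0.05])], [0]))  # ~0.105
print(average_loss([np.array([0.9, 0.05, 0.05])], [1]))  # ~3.0
```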

  • 05:10

    Now I'm playing the training time-lapse 10 times faster

  • 05:13

    The loss function actually held pretty constant for the first 18 seconds

  • 05:16

    Then it started to drop.

  • 05:18

    That big drop corresponds to the text looking much more like English,

  • 05:21

    With the lines finally beginning to start with capital letters (took long enough)

  • 05:25

    And common words like "you," "I," and "the" making their first appearance

  • 05:32

    By 54 seconds, I'd say about half of the words are real

  • 05:35

    So rudimentary grammar rules can start forming

  • 05:39

    "Of the" is one of the most common bigrams in the English language, and here it is.
    "Of the" is one of the most common bigrams in the English language, and here it is.

  • 05:43

    Also, apostrophes are starting to be used for contractions, and we're seeing the origins of one-word interjections

  • 05:52

    Over a minute in, we see the square bracket format start showing up.

  • 05:56

    In the data set, square brackets were used to denote which rapper was speaking at any given time

  • 06:00

    So that means our baby AI's choice of rappers is Guhe Comi, Moth, and Berse Dog Rlacee

  • 06:06

    I also want to quickly point out how much doing this relies on the memory I described earlier.

  • 06:11

    As Andrej's article shows, certain neurons of the network

  • 06:14

    have to be designated to fire only when you're inside the brackets to remember that you have to close them

  • 06:19

    at some point to avoid bracket imbalance

  • 06:23

    [sigh] Okay, this is the point in the video where I have to discuss swear words

  • 06:28

    I know a good chunk of my audience is children, so typically I'd censor this out

  • 06:32

    However, given the nature of our rap data set

  • 06:34

    I don't think it's possible to accurately judge the neural network's performance if we were to do that.

  • 06:38

    Besides, I've included swears in my videos before; people just didn't notice.

  • 06:42

    But that means if you're a kid under legal swearing age, I'm kindly asking you to leave to preserve your precious ears

  • 06:48

    But if you won't leave, I'll have to scare you away

  • 06:53

    Ready?

  • 06:54

    Shit [gasp] fuck [GASP] bitch [GASP!] Peter Ruette [AAAH!]

  • 06:59

    But with that being said,

  • 07:00

    There is one word that's prevalent in raps that- ah- that I don't think I'm in the position to say and- ah

  • 07:06

    Dang it. Why is this glue melting? Okay. Well, I'm pretty sure we all know what word I'm talking about

  • 07:09

    So in the future, I'm just going to replace all occurrences of that word with "ninja"

  • 07:21

    After two minutes, it's learned to consistently put two line breaks in between stanzas

  • 07:27

    and the common label "chorus" is starting to show up (correctly)

  • 07:31

    Also, did you notice the mysterious line "Typed by OHHLA webmaster DJ Flash"?

  • 07:35

    That doesn't sound like a rap lyric! Well, it's not.

  • 07:39

    It appeared 1,172 times in the data set as part of the header of every song that the webmaster transcribed.

  • 07:45

    Now over the next 10 minutes the lyrics gradually got better

  • 07:49

    It learned more intricate grammar rules like that "motherfuckin'" should be followed by a noun,

  • 07:53

    but the improvements became less and less significant

  • 07:55

    So what you see around 10 minutes is about as good as it's gonna get

  • 07:59

    After all, I set the number of synapses to a constant 5 million

  • 08:03

    And there's only so much information you can fit into 5 million synapses

  • 08:07

    Anyway, I ran the training overnight and got it to produce this 600-line file

  • 08:11

    If you don't look at it too long, you could be convinced they're real lyrics

  • 08:15

    Patterns shorter than a sentence are replicated pretty well

  • 08:17

    But anything longer is a bit iffy

  • 08:19

    There are a few one-liners that came out right, like "now get it off" and "if you don't give a fuck about me"

  • 08:26

    The lines that are a little wonky, like "a bust in the air," could be interpreted as poetic

  • 08:31

    Oh, I also like it when it switches into shrieking mode

  • 08:34

    But anyway, we can finally feed this into Google's text-to-speech to hear it rap once and for all

  • 08:49

    Hold on! That was actually pretty bad.

  • 08:52

    The issue here is that we gave our program no way to implement rhythm

  • 08:56

    Which in my opinion is the most important element to making a rap flow.

  • 08:59

    So, how do we implement this rhythm?

  • 09:01

    Well, this is the twist I mentioned earlier in the video.

  • 09:05

    There are two methods. Method one would be to manually time-stretch and time-squish syllables

  • 09:10

    To match a pre-picked rhythm using some audio editing software

  • 09:14

    For this I picked my brother's song "3000 subbies"

  • 09:18

    And I also used Melodyne to auto-tune each syllable to the right pitch. So it's more of a song.

  • 09:29

    Although that's not required for rap.

  • 09:32

    So how does the final result actually sound? I'll let you be the judge

  • 10:00

    I think that sounded pretty fun and I'm impressed with Google's vocal range.

  • 10:04

    However, it took me two hours to time-align everything

  • 10:07

    And the whole reason we used AI was to have a program to automatically generate our rap songs.

  • 10:13

    So we've missed the whole point!

  • 10:14

    That means we should focus on method two: automatic algorithmic time alignment.

  • 10:19

    How do we do that?

  • 10:20

    Well, firstly, notice that most rap background tracks are in the time signature 4/4 or some multiple of it

  • 10:27

    Subdivisions of beats as well as full stanzas also come in powers of two

  • 10:31

    So all rhythms seem to depend closely on this exponential series

  • 10:35

    My first approach was to detect the beginning of each spoken syllable

  • 10:39

    And quantize or snap that syllable to the nearest half beat
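
Snapping to the nearest half beat is just rounding on a time grid; a sketch with a made-up tempo (not the video's exact code):

```python
def snap_to_half_beat(onset_seconds, bpm=100.0):
    """Move a syllable onset to the nearest half beat of the backing track."""
    grid = (60.0 / bpm) / 2.0                # half a beat, in seconds
    return round(onset_seconds / grid) * grid

print(snap_to_half_beat(1.37))  # -> 1.5 (at 100 bpm the grid is 0.3 s)
```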

  • 10:43

    That means syllables will sometimes fall on the beat

  • 10:46

    just. like. this.

  • 10:47

    But even if it fell off the beat, we'd get cool syncopation, just. like. this. Which is more groovy

  • 10:54

    Does this work? Actually, no.

  • 10:56

    Because it turns out detecting the beginning of syllables from waveforms is not so easy.

  • 11:01

    Some sentences, like "come at me, bro"

  • 11:04

    Are super clear, but others like

  • 11:06

    "Hallelujah our auroras are real"
    "Hallelujah our auroras are real"

  • 11:09

    Are not so clear.

  • 11:10

    And I definitely don't want to have to use phoneme extraction. It's too cumbersome

  • 11:13

    So here's what I actually did: I cut corners

  • 11:16

    Listening to lots of real rap,

  • 11:18

    I realized the most important syllables to focus on were the first and last syllables of each line

  • 11:23

    Since they anchor everything in place

  • 11:24

    The middle syllables can fall haphazardly

  • 11:27

    And the listener's brain will hopefully find some pattern in there to cling to

  • 11:31

    Fortunately, human brains are pretty good at finding patterns where there aren't any

  • 11:35

    So to find where the first syllable started,

  • 11:37

    I analyzed where the audio amplitude first surpassed 0.2

  • 11:43

    And for the last syllable I found when the audio amplitude last surpassed 0.2 and literally subtracted a fifth of a second from it
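
A sketch of that bookend trick, assuming a mono waveform of float samples in [-1, 1] (my reconstruction of the rule as described):

```python
import numpy as np

def bookends(samples, sample_rate):
    """First and last times the amplitude crosses 0.2, per the rule above."""
    loud = np.flatnonzero(np.abs(samples) > 0.2)  # sample indices above threshold
    first = loud[0] / sample_rate                 # onset of the first syllable
    last = loud[-1] / sample_rate - 0.2           # minus the flat fifth of a second
    return first, last
```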

  • 11:50

    That's super janky and it doesn't account for these factors, but it worked in general

  • 11:54

    From here I snapped those two landmarks to the nearest beat

  • 11:58

    Time-dilating or contracting as necessary

  • 12:01

    Now if you squish audio the rudimentary way, you also affect its pitch, which I don't want.

  • 12:06

    So I instead used the phase vocoder from the Python library AudioTSM to edit timing without affecting pitch
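
AudioTSM's phase vocoder is driven by a speed factor; roughly like this, with hypothetical file paths:

```python
from audiotsm import phasevocoder
from audiotsm.io.wav import WavReader, WavWriter

with WavReader("line.wav") as reader:
    with WavWriter("line_snapped.wav", reader.channels, reader.samplerate) as writer:
        # speed > 1 contracts the clip, speed < 1 dilates it; pitch is unchanged
        phasevocoder(reader.channels, speed=0.8).run(reader, writer)
```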

  • 12:14

    Now instead of this:

  • 12:23

    We get this:

  • 12:32

    That's pretty promising. We're almost at my final algorithm, but there's one final fix.

  • 12:37

    Big downbeats, which occur every 16 normal beats, are especially important

  • 12:42

    Using our current method,

  • 12:44

    Google's TTS will just run through them like this:

  • 12:53

    Not only is that clunky, it's just plain rude

  • 12:56

    So I added a rule that checks if the next book-end line will otherwise run through the big downbeat.

  • 13:02

    And if so, it will instead wait for that big downbeat to start before speaking.
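
In beat arithmetic, that rule might look like this (a reconstruction under the stated every-16-beats assumption, not the video's code):

```python
def schedule_line(start_beat, length_beats, big=16):
    """Delay a line to the next big downbeat if it would otherwise straddle one."""
    next_big = (start_beat // big + 1) * big
    if start_beat < next_big < start_beat + length_beats:  # line would run through it
        return next_big                                    # wait for the big downbeat
    return start_beat

print(schedule_line(14, 4))  # -> 16: the line waits rather than running through
```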

  • 13:06

    This is better, but we've also created awkward silences.

  • 13:09

    So to fix that I introduced a second speaker

  • 13:16

    When speaker one encounters an awkward silence,

  • 13:18

    Speaker 2 will fill in by echoing the last thing Speaker 1 said, and vice versa.

  • 13:23

    What we get from this is much more natural.

  • 13:26

    Alright, so that's pretty much all I did for rhythm alignment, and it vastly improves the flow of our raps.

  • 13:31

    I think it's time for you to hear a full-blown song that this algorithm generated.

  • 13:35

    Are you ready to experience Computery's first single?

  • 13:38

    I know I sure am.
