  • 00:17

    Hi! This is Lesson 3.3 on using probabilities.

  • 00:21

    It's the one part of Data Mining with Weka where we're going to see a little bit of mathematics,

  • 00:26

    but don't worry, I'll take you through it gently.

  • 00:31

    The OneR strategy that we've just been studying assumes that there is one of the attributes

  • 00:36

    that does all the work, that takes responsibility for the decision.

  • 00:41

    That's a simple strategy.

  • 00:43

    Another simple strategy is the opposite: to assume that all of the attributes contribute equally

  • 00:48

    and independently to the decision.

  • 00:51

    This is called the "Naive Bayes" method --

  • 00:54

    I'll explain the name later on.

  • 00:56

    There are two assumptions that underlie Naive Bayes: that the attributes are equally important

  • 01:02

    and that they are statistically independent,

  • 01:05

    that is, knowing the value of one of the attributes doesn't tell you anything about the value

  • 01:09

    of any of the other attributes.

  • 01:12

    This independence assumption is never actually correct, but the method based on it often

  • 01:17

    works well in practice.

  • 01:23

    There's a theorem in probability called "Bayes Theorem" after this guy Thomas Bayes from the

  • 01:30

    18th century.

  • 01:33

    It's about the probability of a hypothesis H given evidence E.

  • 01:39

    In our case, the hypothesis is the class of an instance and the evidence is the attribute

  • 01:46

    values of the instance.

  • 01:48

    The theorem is that Pr[H|E] -- the probability of the class given the instance, the hypothesis

  • 01:55

    given the evidence -- is equal to Pr[E|H] times Pr[H] divided

  • 02:02

    by Pr[E].
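Written out as a formula, in the same Pr[·] notation, the theorem just stated is:

```latex
\Pr[H \mid E] = \frac{\Pr[E \mid H]\,\Pr[H]}{\Pr[E]}
```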

  • 02:06

    Pr[H] by itself is called the prior probability of the hypothesis H.

  • 02:13

    That's the probability of the event before any evidence is seen.

  • 02:18

    That's really the baseline probability of the event.

  • 02:22

    For example, in the weather data, I think there are 9 yeses and 5 nos, so the baseline

  • 02:29

    probability of the hypothesis "play equals yes" is 9/14 and "play equals no" is 5/14.

  • 02:38

    What this equation says is how to update that probability Pr[H] when you see some evidence,

  • 02:44

    to get what's called the "a posteriori" probability of H -- that means after the evidence.

  • 02:51

    The evidence in our case is the attribute values of an unknown instance. That's E.

  • 03:01

    That's Bayes Theorem.

  • 03:02

    Now, what makes this method "naive"? The naive assumption is -- I've said it before -- that the

  • 03:08

    evidence splits into parts that are statistically independent.

  • 03:13

    The parts of the evidence in our case are the four different attribute values in the

  • 03:19

    weather data.

  • 03:20

    When you have independent events, the probabilities multiply, so Pr[H|E],

  • 03:28

    according to the top equation, is Pr[E|H] times the prior probability

  • 03:33

    Pr[H] divided by Pr[E].

  • 03:37

    Pr[E|H] splits up into these parts: Pr[E1|H],

  • 03:43

    the first attribute value; Pr[E2|H], the second attribute value; and so on for all

  • 03:48

    of the attributes.
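Putting those two steps together for the four weather attributes, the formula being described is:

```latex
\Pr[H \mid E] = \frac{\Pr[E_1 \mid H]\,\Pr[E_2 \mid H]\,\Pr[E_3 \mid H]\,\Pr[E_4 \mid H]\;\Pr[H]}{\Pr[E]}
```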

  • 03:51

    That's maybe a bit abstract, so let's look at the actual weather data.

  • 03:56

    On the right-hand side is the weather data.

  • 03:59

    In the large table at the top, we've taken each of the attributes.

  • 04:03

    Let's start with "outlook". Under the "yes" hypothesis and the "no" hypothesis, we've looked at

  • 04:09

    how many times the outlook is "sunny".

  • 04:11

    It's sunny twice under yes and 3 times under no.

  • 04:14

    That comes straight from the data in the table.

  • 04:18

    Overcast.

  • 04:19

    When the outlook is overcast, it's always a "yes" instance, so there were 4 of those,

  • 04:25

    and zero "no" instances.

  • 04:26

    Then, rainy is 3 "yes" instances and 2 "no" instances.

  • 04:31

    Those numbers just come straight from the data table given the instance values.

  • 04:35

    Then, we take those numbers and underneath we make them into probabilities.

  • 04:40

    Let's say we know the hypothesis.

  • 04:43

    Let's say we know it's a "yes".

  • 04:46

    Then the probability of it being "sunny" is 2/9ths, "overcast" is 4/9ths, and "rainy" 3/9ths,

  • 04:52

    simply because when you add up 2 plus 4 plus 3 you get 9.

  • 04:56

    Those are the probabilities.

  • 04:59

    If we know that the outcome is "no", the probabilities are "sunny" 3/5ths, "overcast" 0/5ths, and "rainy"

  • 05:06

    2/5ths.

  • 05:08

    That's for the "outlook" attribute.

  • 05:11

    That's what we're looking for, you see, the probability of each of these attribute values

  • 05:18

    given the hypothesis H.

  • 05:21

    The next attribute is temperature, and we just do the same thing with that to get the

  • 05:25

    probabilities of the 3 values -- hot, mild, and cool -- under the "yes" hypothesis or the

  • 05:30

    "no" hypothesis.

  • 05:32

    The same with humidity and windy. "Play" gives the prior probability -- Pr[H].

  • 05:39

    It's "yes" 9/14ths of the time, "no" 5/14ths of the time, even if you don't know anything about

  • 05:45

    the attribute values.
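As a sketch of that counting step (not part of the lecture, which reads the numbers off a slide; the rows below are the standard 14-instance nominal weather data):

```python
from collections import Counter, defaultdict

# The 14 nominal weather instances: outlook, temperature, humidity, windy, play.
data = [
    ("sunny","hot","high","false","no"),    ("sunny","hot","high","true","no"),
    ("overcast","hot","high","false","yes"),("rainy","mild","high","false","yes"),
    ("rainy","cool","normal","false","yes"),("rainy","cool","normal","true","no"),
    ("overcast","cool","normal","true","yes"),("sunny","mild","high","false","no"),
    ("sunny","cool","normal","false","yes"),("rainy","mild","normal","false","yes"),
    ("sunny","mild","normal","true","yes"), ("overcast","mild","high","true","yes"),
    ("overcast","hot","normal","false","yes"),("rainy","mild","high","true","no"),
]
attrs = ["outlook", "temperature", "humidity", "windy"]

class_counts = Counter(row[-1] for row in data)      # {'yes': 9, 'no': 5}
value_counts = defaultdict(Counter)                  # (attribute, class) -> value counts
for row in data:
    for attr, value in zip(attrs, row[:-1]):
        value_counts[(attr, row[-1])][value] += 1

# Turn counts into conditional probabilities, e.g. Pr[outlook=sunny | yes] = 2/9.
for cls, n in class_counts.items():
    for attr in attrs:
        print(cls, attr, {v: f"{c}/{n}" for v, c in value_counts[(attr, cls)].items()})
```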

  • 05:47

    The equation we're looking at is this one below, and we just need to work it out.

  • 05:52

    Here's an example.

  • 05:54

    Here's an unknown day, a new day.

  • 05:56

    We don't know what the value of "play" is, but we know it's sunny, cool, high, and windy.

  • 06:05

    We can just multiply up these probabilities.

  • 06:07

    If we multiply for the "yes" hypothesis, we get 2/9ths times 3/9ths times 3/9ths times

  • 06:13

    3/9ths -- those are just the numbers on the previous slide: Pr[E1|H], Pr[E2|H], Pr[E3|H],

  • 06:22

    Pr[E4|H] -- and finally Pr[H], which is 9/14ths.

  • 06:28

    That gives us a likelihood of 0.0053 when you multiply them.

  • 06:36

    Then, for the "no" class, we do the same to get a likelihood of 0.0206.
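In code, that multiplication is just the following (the "no" factors 3/5, 1/5, 4/5, and 3/5 are read from the same table, for sunny, cool, high, and windy respectively):

```python
# Likelihood of each class for the new day: sunny, cool, high, windy=true.
like_yes = (2/9) * (3/9) * (3/9) * (3/9) * (9/14)
like_no  = (3/5) * (1/5) * (4/5) * (3/5) * (5/14)
print(round(like_yes, 4))   # 0.0053
print(round(like_no, 4))    # 0.0206
```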

  • 06:44

    These numbers are not probabilities.

  • 06:46

    Probabilities have to add up to 1.

  • 06:48

    They are likelihoods.

  • 06:49

    But we can get the probabilities from them by using a straightforward technique of normalization.

  • 06:55

    Take those likelihoods for "yes"

  • 06:56

    and "no" and we normalize them as shown below to make them add up to 1.

  • 07:02

    That's how we get the probability of "play" on a new day with different attribute values.
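The normalization step, picking up the two likelihoods just computed:

```python
like_yes, like_no = 0.0053, 0.0206   # likelihoods from the previous step
total = like_yes + like_no
print(round(like_yes / total, 3))    # ~0.205: probability of "yes"
print(round(like_no / total, 3))     # ~0.795: probability of "no"
```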

  • 07:10

    Just to go through that again.

  • 07:11

    The evidence is "outlook" is "sunny", "temperature" is "cool", "humidity" is "high", "windy" is "true" --

  • 07:17

    and we don't know what play is.

  • 07:19

    The likelihood of a "yes" given the evidence is the product of those 4 probabilities -- one

  • 07:26

    for outlook, temperature, humidity, and windy -- times the prior probability, which is

  • 07:33

    just the baseline probability of a "yes".

  • 07:37

    That product of fractions is divided by Pr[E].

  • 07:40

    We don't know what Pr[E] is, but it doesn't matter, because we can do the same calculation

  • 07:45

    for the "no" class, which gives us another equation just like this, and then we can calculate

  • 07:52

    the actual probabilities by normalizing them so that the two probabilities add up to 1.

  • 07:56

    Pr[E] for "yes" plus Pr[E] for "no" equals 1.

  • 08:02

    It's actually quite simple when you look at it in numbers, and it's simple when you look

  • 08:07

    at it in Weka, as well.

  • 08:09

    I'm going to go to Weka here, and I'm going to open the nominal weather data,

  • 08:15

    which is here.

  • 08:19

    We've seen that before, of course, many times.

  • 08:22

    I'm going to go to Classify.

  • 08:25

    I'm going to use the NaiveBayes method.

  • 08:29

    It's under this bayes category here.

  • 08:30

    There are a lot of implementations of different variants of Bayes.

  • 08:34

    I'm just going to use the straightforward NaiveBayes method here.

  • 08:38

    I'll just run it.

  • 08:42

    This is what we get.

  • 08:44

    We get the success rate, calculated using cross-validation.

  • 08:48

    More interestingly, we get the model.

  • 08:51

    The model is just like the table I showed you before, divided under the "yes" class and

  • 08:56

    the "no" class.

  • 08:58

    We've got the four attributes -- outlook, temperature, humidity, and windy -- and then,

  • 09:04

    for each of the attribute values, we've got the number of times that attribute value appears.
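As an aside, roughly the same model can be fit outside Weka. Here's a sketch using scikit-learn's CategoricalNB -- a substitute analogue, not Weka's own implementation; its alpha=1 setting applies the add-one count adjustment explained next:

```python
# Rough analogue of the Weka run, using scikit-learn instead of Weka.
from sklearn.naive_bayes import CategoricalNB
from sklearn.preprocessing import OrdinalEncoder

rows = [  # the 14 weather instances: outlook, temperature, humidity, windy, play
    ("sunny","hot","high","false","no"),    ("sunny","hot","high","true","no"),
    ("overcast","hot","high","false","yes"),("rainy","mild","high","false","yes"),
    ("rainy","cool","normal","false","yes"),("rainy","cool","normal","true","no"),
    ("overcast","cool","normal","true","yes"),("sunny","mild","high","false","no"),
    ("sunny","cool","normal","false","yes"),("rainy","mild","normal","false","yes"),
    ("sunny","mild","normal","true","yes"), ("overcast","mild","high","true","yes"),
    ("overcast","hot","normal","false","yes"),("rainy","mild","high","true","no"),
]
enc = OrdinalEncoder()
X = enc.fit_transform([r[:-1] for r in rows])    # nominal values -> integer codes
y = [r[-1] for r in rows]

clf = CategoricalNB(alpha=1.0).fit(X, y)         # alpha=1: add-one smoothing
new_day = enc.transform([("sunny", "cool", "high", "true")])
print(clf.predict(new_day))                      # ['no']
print(clf.predict_proba(new_day))                # class probabilities
```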

  • 09:10

    Now, there's one small but important difference between this table and the one I showed you before.

  • 09:15

    Let me go back to my slide and look at these numbers.

  • 09:18

    You can see that for outlook under "yes" on my slide, I've got 2, 4, and 3, and Weka has

  • 09:26

    got 3, 5, and 4.

  • 09:29

    That's 1 more each time for a total of 12, instead of a total of 9.

  • 09:35

    Weka adds 1 to all of the counts.

  • 09:39

    The reason it does this is to get rid of the zeros.

  • 09:42

    In the original table, under outlook, the probability of overcast given "no" is

  • 09:50

    zero, and we're going to be multiplying that into things.

  • 09:53

    What that would mean in effect, if we took that zero at face value, is that the probability

  • 09:58

    of the class being "no" given any day for which the outlook was overcast would be zero.

  • 10:06

    Anything multiplied by zero is zero.

  • 10:09

    These zeros in probability terms have sort of a veto over all of the other numbers, and

  • 10:13

    we don't want that.

  • 10:14

    We don't want to categorically conclude that it must be a "no" day on the basis that it's overcast,

  • 10:21

    and we've never seen an overcast outlook on a "no" day before.

  • 10:26

    That's called the "zero-frequency problem", and Weka's solution -- the most common solution

  • 10:30

    -- is very simple: we just add 1 to all the counts.

  • 10:34

    That's why all those numbers in the Weka table are 1 bigger than the numbers in the table

  • 10:39

    on the slide.

  • 10:42

    Aside from that, it's all exactly the same.

  • 10:45

    We're avoiding zero frequencies by effectively starting all counts at 1 instead of starting

  • 10:50

    them at 0, so they can't end up at 0.
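The fix is literally one line per count table; here's a sketch of the add-one idea (Weka's internal estimator details may differ slightly):

```python
# Add-one ("Laplace") smoothing applied to the outlook counts under "yes".
counts = {"sunny": 2, "overcast": 4, "rainy": 3}
smoothed = {value: c + 1 for value, c in counts.items()}   # {'sunny': 3, 'overcast': 5, 'rainy': 4}
total = sum(smoothed.values())                             # 12 instead of 9
probs = {value: c / total for value, c in smoothed.items()}
print(probs)   # no value can end up with probability zero
```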

  • 10:57

    That's the Naive Bayes method.

  • 10:59

    The assumption is that all attributes contribute equally and independently to the outcome.

  • 11:04

    That works surprisingly well, even in situations where the independence assumption is clearly violated.

  • 11:11

    Why does it work so well when the assumption is wrong?

  • 11:13

    That's a good question.

  • 11:15

    Basically, classification doesn't need accurate probability estimates.

  • 11:19

    We're just going to choose as the class the outcome with the largest probability.

  • 11:25

    As long as the greatest probability is assigned to the correct class, it doesn't matter if

  • 11:29

    the probability estimates aren't all that accurate.

  • 11:33

    However, this means that if you add redundant attributes, you get problems with Naive Bayes.

  • 11:38

    The extreme case of dependence is where two attributes have the same values, identical

  • 11:44

    attributes.

  • 11:46

    That will cause havoc with the Naive Bayes method.

  • 11:49

    However, Weka contains methods for attribute selection to allow you to select a subset

  • 11:54

    of fairly independent attributes, after which you can safely use Naive Bayes.
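To see why identical attributes cause havoc, here's a small sketch: duplicating "outlook" counts the same piece of evidence twice, so its factor gets squared and the posterior is pushed too hard in one direction.

```python
# Effect of duplicating the outlook attribute on a "sunny" day.
pr_sunny = {"yes": 2/9, "no": 3/5}
prior    = {"yes": 9/14, "no": 5/14}

def posterior_yes(copies):
    """Posterior Pr[yes | sunny repeated 'copies' times], after normalizing."""
    like_yes = pr_sunny["yes"] ** copies * prior["yes"]
    like_no  = pr_sunny["no"]  ** copies * prior["no"]
    return like_yes / (like_yes + like_no)

print(round(posterior_yes(1), 2))   # 0.4 -- the evidence counted once
print(round(posterior_yes(2), 2))   # 0.2 -- double-counted, overly confident "no"
```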

  • 12:01

    There's quite a bit of stuff on statistical modeling in Section 4.2 of the course text.

  • 12:07

    Now you need to go and do that activity.

  • 12:12

    See you soon!
