Hey, everyone. Welcome back.
Um, can people hear me okay?
So, as usual,
you can take a second to enter your,
uh, SU ID so we know who's here.
Um, so today's lecture will be a Choose Your Own Adventure lecture, um.
So I think,
by now you've learned a lot about the, um,
technical aspects of building learning algorithms, and then,
in the third course, uh,
in the third set of modules,
you saw some of the principles for
Learning and how best to use these tools,
um, in order to be efficient in how you build a machine learning application.
What I want to do today is step through with you
a complicated machine learning application,
and, um, throughout all of today's lecture,
I'm gonna
step you through a scenario and then ask you to kind of choose your own adventure.
Because if you have to work on this project,
what are you gonna do, right?
Um, and to give you more of that practice in the next, um,
what, hour and a bit that we have, uh,
on thinking through machine learning strategy, um.
And you know I've,
I've seen in so many projects, uh, there are,
there are sometimes things that
a less sophisticated team will take a year to do,
but if you're actually very strategic and very
sophisticated in deciding what you will do next, right,
how to drive a project forward,
I've seen many times that what a different team will take a year to do,
maybe you could do it in a month or two, right.
And if you're trying to, um,
write a research paper, or build a business, or build a product,
the ability to drive
a machine learning project quickly gives you a huge advantage, and
you're making much more efficient use of your life as well, right.
Um, so, for today I'd like to,
er, I'm gonna pose a scenario,
pose a machine learning application and say,
I mean, you are the CEO of this project, what are you going to do next?
So, but I'd like to have today's meeting be quite interactive as well.
So, can I get people to sit in groups of two,
and ideally three or so,
maybe one and
try to sit next to someone that you don't work with all the time.
Er, so, so, if you're sitting,
sitting next to your best friend,
I'm glad your best friend is in the class with you,
but go sit with someone else because I think,
um - I've done this multiple times and the discussion's
actually richer if you talk to someone that you don't know super well.
So actually take a second,
introduce yourself and, and, and, and just greet your neighbor, I guess.
So, the example I wanna go through today is actually
a of the example I described briefly,
uh, uh, in the last lecture I taught, uh,
Building a Speech Recognition System, right.
So, remember I briefly motivated this ,
wake word or last time where,
I, I, I, I, I actually have both an
but you know, it's it's a lot of work to
um, and so if you can build a chip,
uh, to sell to say a lamp maker,
to recognize uh, phrases like,
you know - let's say we call the lamp Robert, right, um.
Then you can recognize phrases like Robert turn on,
Robert turn off, and you have a little switch to give this thing different names,
you can call it Robert, or,
or
So you can also have
Just give, give your lamp a name and just say Hey,
Robert,
So, rather than detecting different names and turn on and turn off,
I'm just gonna focus on - just for the technical discussion,
I'm just going to focus on the phrase Robert turn on, er,
but it's kind of the same problem we need to solve like,
four times, for the two names and for turn on and turn off.
So, I'm gonna
so if you wanna name your lamp Robert and,
um, tell your lamp to turn on, um.
I think I was inspired by one
wrote these um,
and all his robots' names started with R. So,
maybe R is Robot turn on, um.
And so, uh, let's see.
So, let's say that, um,
you are the new CEO of a small
you know, three persons, uh,
and your goal is to build an, um,
is to build a circuit or actually,
your goal is to build a learning
that can recognize this phrase,
so that when someone, you know,
buys this lamp and they say Robert turn on,
then the lamp can turn on, right?
And just focusing on the task of building,
then, you know, to,
to be CEO of the
You need to figure out how to do the
you figure out who are the lamp makers, etc.
So there's all that stuff, but for today,
let's just focus on the machine learning aspect of it, um.
And so my first question to you is very open-ended,
is - but and this is the life of a CEO, right?
You wake up one day and you've just got to decide what to do, um.
But so my first question to you is the open-ended question is,
you're the CEO, uh,
you're gonna show up at work, uh, you know,
tomorrow in your
a learning
So, um, so my question is,
what are you gonna do, right?
So take a, take a minute to,
uh, answer that by yourself first.
Uh, don't, don't discuss with your neighbor yet,
but you know, you're gonna show up in your office and,
and then you're gonna start working on
these engineering problems of building a
uh, and, and do this as yourself, right?
Don't, don't, don't pretend the - yeah, this
er, er, er, er,
Just do it. Just say yeah,
I - but I don't think this is a terrible
I, I, I, I, this is not the best idea but I think this could work.
So you're actually welcome to do this.
But let's say you decide to do this and you go
into your office tomorrow, right, what do you do, right?
Why don't you take, um,
why don't you take let's say two minutes to enter an answer,
then we can, then we can discuss.
[NOISE] In fact,
uh, yeah, yes, I,
one thing I really like about the answer was actually
the read exist - read existing literature part, right, um.
In fact when you start a new project, um, uh,
uh, and I think, um,
uh, when you start doing a new project like that,
assuming you've not worked with trigger
reading research papers or reading code on
this problem is actually a very good way to quickly level up your knowledge, um.
And I think that, you know it, it, it turns out that,
uh, in terms of your,
uh, exploration strategy, right, um,
I want to describe to you how I read research papers, um,
uh, which is - so this is, um,
not a good way to review the literature which is if
the x-axis is time and the
what some people will do is find the first research paper and read that until it's done.
And then go and find a second research paper and read that until it's done,
and then go and find
the third research paper, and this is a very sequential way
of reading research papers, and I find
that the more strategic way to go through these resources,
everything ranging from
lots of good Medium articles that explain things,
right, uh, research papers,
um, right,
Is if you use a parallel exploration process where -
This, this is actually what it feels like when I'm doing research
I'm trying to learn about a new field I'm not that experienced in.
Right, so I've just done a lot of work on trigger
But if I hadn't worked on this before,
then I would probably find you know, three papers.
So again x-axis is time and
kind of
And based on that, you might decide to read that one in greater detail,
and then to add other papers that you start
and maybe find another one that you want to read in great detail,
and then to gradually add new papers to
your reading list and read some to completion and some not to completion.
Um, you, I, I was actually
one of my friends, uh,
a for - former student at
who mentioned that he was wanting to learn about a new topic.
And he, he was,
he told me he's compiling a reading list of 200 research papers that
he wanted to read. That sounds like a lot.
You rarely read 200 papers.
But I think if you read 10 papers,
you have a basic understanding.
If you read 50, you have a pretty decent understanding.
If you read like 100,
I think you have a very good understanding, uh,
of a field, but often this is [NOISE] time well spent, I guess.
Um and ah, some other tips, again this is,
I'm - I'm really thinking if you really are
a CEO of this startup and this is what you wanna do,
what advice would I give you?
Um, ah, ah when you're reading papers ah,
other thing to realize,
one is that uh,
some papers don't make sense.
Right. And this is fine.
Uh, uh, even I read some papers I would just go
I don't think that makes sense.
Uh, and, and it's not that
find papers from a decade ago and we learned that half of it was great
and the other half of it you know
was really talking about things that were not that important.
Right, so it's okay.
Uh, uh, authors you know,
usually papers are technically accurate but often what they thought was
important like maybe an author thought
that using batch norm was really important for this problem.
But it just turns out not to be the case.
That, that happens a lot.
That happens sometimes.
[NOISE] Um, and I think the other tactic that I see
talking to experts including contacting the authors.
So when I read a paper [NOISE] um, uh,
I don't, I don't bother the authors
unless I've actually like tried to figure it out myself.
Right. But if you actually spend some time trying to understand
the paper and if it really doesn't make sense to you uh,
uh, uh, it's, it's,
it's okay to
And [NOISE] and people are busy.
Maybe there's a 50 percent chance they respond.
And that's okay because it takes you five minutes
to write an
That could be time pretty well spent.
Uh, uh, but, but don't, don't,
don't bother people unless you've tried to do your own work.
I, I usually get a lot of
not feel like they've done their own work an - and I just write, and then yes.
So just don't, don't,
don't bother people unless you've actually tried to do your own work.
[LAUGHTER] Um [NOISE] cool.
So, after um, [NOISE] looking at the literature ah,
and having a base maybe
an open source implementation or getting a sense of the avenue you wanna try.
Oh and it, it turns out the,
the trigger
one literature where there isn't consensus on.
This is a
Right. Despite all the
there isn't actually consensus in the,
in the research community today on, like,
this is the best avenue to try.
Um, but so let's say that um, you've read some papers,
you want to start training your first system.
Right. And last time we talked about this we talked a little bit
about how much time you will spend to collect data,
and,
Spend like a day, or maybe two days at most.
So collect your first data sets and start training up a model.
Um, but my next question to you is what data would you collect?
Right. Um, in particular what train
[NOISE] would you collect?
So you've decided on an initial
want to [NOISE] train something to recognize this phrase,
Uh, I think this uh, probably I don't think it's possible
I don't think anyone has collected a data set
with the words Robert turn on and posted it on the Internet.
So you have to collect your own data for
this particular trigger phrase that you wanna use.
But um, you know as CEO of this startup trying to build a
detect the phrase "
Right. So why don't you take,
why don't you again take, um,
let's say three minutes to write an answer to this.
Yeah. I think this is an interesting one.
Um, Robert turn on over and over and then data augmentation.
[NOISE] Um, data augmentation,
uh, is a way to reduce
the variance.
And ah, having worked on this problem I happen to know data augmentation
is very useful for this problem.
But if you didn't already know that fact, this is
one of the things I would probably not do right away,
because I would train a quick and dirty system and check that you
have a high variance problem before investing the effort into data augmentation.
So data augmentation is,
like, it never hurts here or it rarely hurts,
it usually helps, but I wouldn't bother to make that investment unless you have
collected the evidence that you actually have
a high variance problem and that this is actually a good use of your time.
Right.
Yeah. [NOISE] I think this,
this one actually, this is actually nice.
So
You get it done really quickly.
Um, uh, when I'm working with teams um,
I actually think in terms of hours,
in terms of how long it takes us to do something.
So this one you can probably do in like 30 minutes.
Right so you get your data set collected in 30 minutes and get going.
Uh, or you run around Stanford and ask
friends or strangers to speak into your, uh, laptop microphone.
You could spend a few hours to get a much bigger data set, and possibly, at a startup,
I'd probably do that. I'd probably go and collect
data for several hours rather than only spend 30 minutes.
But this is actually pretty interesting as well because it lets you
get it done really quickly. That makes sense?
Right. [NOISE]
So let me actually uh, uh,
share some more concrete advice.
Right. And, and I think I should someti - sometime back um, to,
to prepare a
Ken and
to, to create a homework right, that, that,
that you'll see later in this course. So this is like a [NOISE].
Uh, this
a nice learning example that we're using in a few points throughout this course.
Um, so here's one thing you can do.
Uh, and this, this is actually what's um,
[NOISE] what we did.
Right. Which is uh, collect [NOISE] um,
well
[NOISE] collect a hundred examples [NOISE] of, uh,
uh, 10-second audio clips.
Right? And so, uh,
it turns out once you grab a hold of someone,
uh, and ask them to speak into your microphone, you know,
you can keep them for, um, three seconds,
which is how long it takes to say Robert turn on,
or you can keep them for ten seconds, and they're actually very
willing to spend the extra seven seconds with you.
Right? Um, but so if this is ten seconds of audio data, you know,
so this is ten seconds of audio, right, and,
and audio is just patterns of,
uh, little changes in air pressure.
Right. So, if you plot audio,
the reason it looks like this
It's just, uh, the,
the way you're hearing my voice is you know,
my voice or the speakers are creating very rapid changes in air pressure and
your ear measures those very rapid changes in
air pressure, interprets the sound and so a microphone, uh,
is a sensitive device for recording these very very high frequency changes in
air pressure, and these plots that you see
in audio are just the air pressure at different moments in time.
Right? But so given a, um,
uh, ten second clip like this.
If this is a three-second section
where they said Robert turn on,
then what you would like to do is to build a desk lamp, say,
that can sit here while the lamp is turned off,
and at the moment they finish saying Robert turn on, yeah, you turn it on.
So this is, uh, the output label y, really, right?
So - so what you want to do for the
um, at you know,
pretty much the moment they finish saying Robert turn on, uh,
you want your learning algorithm to output a one, that's your target label y saying,
"Yeah, I just heard this trigger word."
Uh, and for all other times,
you want it to output zero,
right? Because, uh, the one is when you
decide to turn on the lamp at that moment in time.
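A minimal sketch of the target labels just described, assuming the 10-second clip is divided into fixed-size time steps. The step size, the function name `make_step_labels`, and the 6.2-second end time are hypothetical choices for illustration, not something stated in the lecture.

```python
def make_step_labels(clip_seconds, phrase_end_seconds, step_seconds=0.5):
    """Return one 0/1 label per time step of the clip.

    The label is 1 only for the step in which the speaker finishes
    the trigger phrase, and 0 everywhere else.
    """
    num_steps = int(round(clip_seconds / step_seconds))
    labels = [0] * num_steps
    # Index of the step that contains the end of the phrase.
    end_step = min(int(phrase_end_seconds / step_seconds), num_steps - 1)
    labels[end_step] = 1
    return labels


# Assumed example: the speaker finishes the phrase 6.2 s into the clip.
labels = make_step_labels(clip_seconds=10, phrase_end_seconds=6.2)
print(labels)
```

With these assumed numbers there are 20 steps and exactly one of them is labeled 1, which is why positive labels end up so rare.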
Right? So to collect the data set,
um, here's something you can do,
which is collect 100 audio clips
of 10 seconds each.
I will really, you know,
look at these numbers and think okay,
Let's say you are running around
uh, uh, maybe 10 people,
10 clips per person or maybe 100 different people.
Um, I will actually estimate, you know,
if you go to
how long does it take to get one person?
You can probably get one person every minute or two,
if you go to a busy place on,
on like the
So you could probably get this done in like, uh,
100 to 200 minutes, like, two or three hours right?
It's not that bad. So you get this done quite quickly.
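The back-of-the-envelope estimate above can be checked in a few lines. This is only a sketch of the habit of estimating in hours; the variable names are made up for illustration.

```python
people = 100
minutes_per_person = (1, 2)  # optimistic and pessimistic guesses from the estimate

low_hours = people * minutes_per_person[0] / 60
high_hours = people * minutes_per_person[1] / 60
print(f"{low_hours:.1f} to {high_hours:.1f} hours")  # 1.7 to 3.3 hours
```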
Um, and so and,
and let's say you collect 100 clips,
for the, for the purposes of,
uh, today let's say,
you collect 100 for your training set,
25 for your dev set,
um, [NOISE] and zero for the test set.
Right. That's actually not that
a new product to just not have a test set because your goal is, uh,
to build something that convinces, you know,
just early
sometimes they don't bother with a test set.
If, if the goal is to publish a paper, then
of course you need a test set.
But if you're just building a product,
then you don't need a
just get started without dealing with a test set, right?
So it's pretty easy to get started.
Um,
[NOISE]
So taking that audio clip from above, um,
one thing you can do to turn this into a supervised learning problem, uh,
is to take - so the,
the phrase Robert turn on can be said in less than three seconds.
So let's say you take three seconds as
Right? So we can do is, uh,
So let's say here was when Robert turn on was said.
So what you can do is, uh,
oh, right, and the target label is zero, zero, zero,
zero, zero, zero, zero one,
zero, zero, zero, zero.
Um, what you can do is then
So here's one audio clip,
you can take that audio clip.
This is X and the target label is zero because
Robert turn on was not said [NOISE].
Um, and you can take, on this audio clip, a different random
three second clip, and that clip also has a target label 0.
Um,
for this one right?
which is a three-second clip that
ends at,
uh, the last part of the "on" sound, you would have a target label of 1.
Right? So, and - and when - when you learn about sequence models, RNNs,
you learn a better method than this explicit clipping.
But for now let's say you take these,
um, audio clips and turn it into, uh,
three - so take a 10 second clip and by clipping
out ran - different windows you can take your,
um, let's say 100, uh - uh, clips.
And because for each ten second clip you can take different windows,
you could turn this into let say,
uh, 3,000 [NOISE] training examples.
Right? So here I took a ten second clip and, you know,
took three different three second windows, but if you take 30 three second windows,
then each 10 second audio clip becomes 30 examples.
And now, you've turned the problem into
a binary classification problem
that inputs a three-second clip and labels it as either zero or one.
Right? Does this make sense?
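The explicit clipping can be sketched as below, working only with timestamps rather than real audio. The function name `sample_windows`, the 0.25-second `tolerance` for counting a window as ending "at" the phrase, and the fixed phrase end time of 6.0 seconds are assumptions made for the example.

```python
import random


def sample_windows(clip_seconds, phrase_end_seconds, num_windows,
                   window_seconds=3.0, tolerance=0.25, seed=0):
    """Cut random fixed-length windows out of one clip and label each one.

    A window is labeled 1 when it ends within `tolerance` seconds after the
    moment the trigger phrase finishes, and 0 otherwise.
    Returns a list of (start_seconds, end_seconds, label) tuples.
    """
    rng = random.Random(seed)
    examples = []
    for _ in range(num_windows):
        start = rng.uniform(0.0, clip_seconds - window_seconds)
        end = start + window_seconds
        label = 1 if 0.0 <= end - phrase_end_seconds <= tolerance else 0
        examples.append((start, end, label))
    return examples


# 100 clips x 30 windows each = 3,000 labeled examples.
dataset = []
for clip_id in range(100):
    dataset.extend(sample_windows(10.0, phrase_end_seconds=6.0,
                                  num_windows=30, seed=clip_id))
print(len(dataset))  # 3000
```

Because windows are drawn at random, most of them do not end right at the phrase, so the large majority of the 3,000 examples carry label 0.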
And so this is an example of, uh,
the - the more complex, uh,
pipelines you might have,
if you're building a learning algorithm.
So take a continuous, you know,
audio
a
learned how to build various
Right? And again, when you learn about RNNs,
you learn about other ways to process sequence data or
Okay. So, um, oh, go ahead.
[INAUDIBLE]
[NOISE] Uh is this
if you have 100 examples, uh,
it's not that hard to just listen to it on your laptop or
some audio playing software to figure out when - when they finished saying,
uh, Robert turn on.
And then at that moment to put a one in the target label right?
Because this is really when you want the lamp to turn on, right?
Makes sense? Cool.
So, um, any other questions?
Actually, feel free to ask clarifying questions, yeah, go ahead.
I, I wonder if this is gonna cause a problem but ones are too
Oh, sure, let me get back to that. Anything else?
Is there a specific reason why you only train them on a few seconds instead of ten seconds, [INAUDIBLE]
[NOISE] I see, yeah, why do we do three seconds, or four or five seconds
then there's another
uh, [NOISE]
you have to say it really slowly for it to take a few seconds, right?
This is Robert turn on.
And so again, this is a design choice.
[LAUGHTER] Um, yeah, all right.
So, so, um, let's say you do this,
feed it to a supervised learning algorithm,
train a
and let's say that when you classify this,
ah, when, when you run this algorithm,
you end up with 99.5 percent accuracy [NOISE] right? Um,
uh, but you find that
[NOISE]
Right? Um, and, and what I mean is that whatever audio you give it,
it just outputs zero all the time.
So
I never heard the phrase "Robert turn on", you know.
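A small sketch of why accuracy can look this good while the system is useless. The dev set below is hypothetical: 1,000 windows of which only 5 are positive, chosen so that the numbers match the 99.5 percent figure.

```python
def accuracy(predictions, labels):
    """Fraction of examples where the prediction equals the label."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical dev set: 1,000 windows, only 5 of which are positive.
labels = [1] * 5 + [0] * 995
always_zero = [0] * len(labels)  # a "classifier" that never fires

print(accuracy(always_zero, labels))  # 0.995

detected = sum(p == 1 and y == 1 for p, y in zip(always_zero, labels))
print(detected)  # 0 of the 5 positives found
```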
So, so, uh, so, uh,
so, so my question to you is,
you know - and by the way,
the reason I'm going through these scenarios is,
um, I found that, uh,
a good way to gain good
and to become good at making these decisions,
is these are the decisions that a project leader,
These are actually like pretty much exactly the decisions you need to make.
And I find that, um,
one of the ways to gain this type of experience is you,
you know, find a job with a good AI team and work with them for five years, right?
And then you actually live through this and you see what they do.
But instead of needing you to go and spend five years to see ten examples of this,
I'm trying to step you through maybe one example in, in one hour.
So, so instead of, uh, you know,
gaining this experience through work experience, which is great,
but takes many, many years [LAUGHTER] or many,
many months hoping to,
you know, let's just put you in the position of making these decisions.
You can learn from them much faster, right?
Um, so - and all the examples I'm giving are actually completely realistic, right?
They're either exactly or very similar to things I have seen in,
in actual, you know, very real projects.
So the question is, your learning algorithm gives this result,
99.5 percent accuracy,
Let me mention some of - some of the answers I really liked.
I think that, uh, um, you know,
I - when I think of building learning
the process is often specify a
And then you don't always have to do it,
but it's good
It's just - it is,
uh,
If you have a very clear specification of the problem.
And I think one insight out of this is that if your
Because it's so
that accuracy in your
Because, you know, presumably,
this is 99.5 percent accurate on the dev set as well.
But this performance is terrible.
So it's doing great on a dev set,
on your accuracy
So I think of it as good
You know, this is kind of good sound practice, uh, to,
to just specify, make sure you at least have a dev set and
evaluation
So making the dev set more balanced, uh,
equal numbers of positive and negative would,
would be a good step toward that.
Um, and then I think,
um, uh, you could also,
um - there are a few people that talked about, um,
giving higher weights to the positive examples, right?
So, you know, um, uh,
one way to do this is to
them more
maybe closer to a balanced ratio positive negative examples.
That'd be okay. The other way to not do
we'd just give the positive examples a greater weight, right?
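One way to give positive examples a greater weight is a weighted log loss, sketched below in plain Python. The function name, the 1:199 imbalance, and the choice of weight are illustrative assumptions, not the lecture's exact recipe.

```python
import math


def weighted_log_loss(probabilities, labels, positive_weight=1.0):
    """Binary cross-entropy where each positive example counts
    `positive_weight` times as much as a negative one."""
    total, weight_sum = 0.0, 0.0
    for p, y in zip(probabilities, labels):
        w = positive_weight if y == 1 else 1.0
        total += -w * (y * math.log(p) + (1 - y) * math.log(1 - p))
        weight_sum += w
    return total / weight_sum


labels = [1] + [0] * 199      # hypothetical 1:199 imbalance
lazy = [0.01] * len(labels)   # model that almost always says "no"

unweighted = weighted_log_loss(lazy, labels)
weighted = weighted_log_loss(lazy, labels, positive_weight=199.0)
print(unweighted < weighted)  # True
```

With the weight applied, the model that ignores the positive example is penalized much more heavily, so always outputting zero stops being a cheap solution.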
Um, I would probably
Um, another thing you could do, um,
uh, you know, in the,
in the interests of, um, speed,
even if it's not the
most sound thing to do,
is to change the target labels to be a bunch of ones after that.
And this is
But if you've implemented the rest of this code already,
this might be a reasonable,
you know, a little bit
But this is - this, this, this might work well enough.
Right? I would - I might not - I don't know if I would want
to try to write an academic research paper with this method,
maybe you can get away with it.
But this little thing that I think if you tried to publish a paper with this,
academic
you know, maybe this is okay.
But I think if you want something quick and dirty that just works,
changing a bunch of labels to be ones so that,
say,
Uh, that ends just a little bit after Robert turn on, that's still labeled one,
that'll be pretty reasonable.
But this will be saying that, uh,
for anywhere [NOISE] within maybe a 0.5 second period after Robert turn on finished,
it's okay to turn on the light
That you kind of wanna be turning on the light.
Turning on the lamp, you know,
say within half a second, right, after Robert turn on is - has been said.
And this would be a way to just get more labels of ones in there.
All right, that makes sense? Yeah?
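The quick and dirty label change can be sketched as follows, assuming labels are stored per time step. The helper name `widen_positive_labels` and the 0.1-second step size are hypothetical.

```python
def widen_positive_labels(labels, extra_steps):
    """Return a copy of `labels` in which every 1 is followed by
    `extra_steps` additional 1s (clipped at the end of the sequence)."""
    widened = list(labels)
    for i, y in enumerate(labels):
        if y == 1:
            for j in range(i, min(i + extra_steps + 1, len(labels))):
                widened[j] = 1
    return widened


# Toy label sequence with 0.1-second steps, so 5 extra steps
# covers roughly half a second after the phrase ends.
original = [0] * 12 + [1] + [0] * 7
widened = widen_positive_labels(original, extra_steps=5)
print(widened)
```

The single positive step becomes six, which changes the positive-to-negative ratio without collecting any new data.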
With like
how does that translate to when you deploy this,
you're not going to see Robert turn on as much, right?
Like one out of 1,000 might be reflected,
but what do we expect to see?
Yeah, yeah. Right. So, um,
I think that, uh - how do we put it?
Um, so if you actually - yes, so uh, this,
this is sort of a dev set and evaluation measure kind of question, right?
So, uh, one of the - a couple of the
uh, when actually working on this, is,
uh, when someone says Robert turn on,
what is the chance that it actually wakes up, or the lamp turns on?
And then the second is, if no one is saying anything to the lamp, you know,
how often does it turn on anyway?
So those are the two metrics.
And, and sometimes you also
try to combine them in a single number evaluation metric.
Uh, but I think that, um, uh,
you could identify the data sets and measure both of these things.
into a single real number, which I think - yeah.
And I think one of the ways you talked about in the,
in the videos as well as, right?
Does that make sense? Uh, yeah.
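The two quantities described here can be measured separately on a labeled dev set. This is a sketch on made-up predictions; the function names `wake_rate` and `false_wake_rate` are labels chosen for this example, not standard terms from the lecture.

```python
def wake_rate(predictions, labels):
    """Fraction of true trigger events on which the lamp turned on."""
    positives = [p for p, y in zip(predictions, labels) if y == 1]
    return sum(positives) / len(positives)


def false_wake_rate(predictions, labels):
    """Fraction of non-trigger windows on which the lamp turned on anyway."""
    negatives = [p for p, y in zip(predictions, labels) if y == 0]
    return sum(negatives) / len(negatives)


# Made-up dev set: 4 trigger windows followed by 10 non-trigger windows.
labels      = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
predictions = [1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]

print(wake_rate(predictions, labels))        # 0.75
print(false_wake_rate(predictions, labels))  # 0.1
```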
But I think -
on what is it that satisfies a user need, right?
And, uh, and just one, one thing about,
uh, the straightforward way of
is that if you don't do this then your whole data set,
just has very few positive examples, right?
Um, and so if you throw away all the negative examples,
so that you cut down the number of negative examples until
you have exactly equal numbers of positives and negatives,
you've actually thrown away a lot of negative examples.
This makes sense? And so one,
one problem with the straightforward way of
is that, you know, in your audio clip,
in your test 10 second clip that we collected by running around Stanford,
um, you have one example of Robert turn on.
And so if you want exactly per,
perfectly balanced positive and negative,
it means that you're allowed to only
You can say, that's negative and that's a positive.
And you can't
So, so if you use,
uh - if you insist on a perfect
you're actually throwing away a lot of negative examples that,
that could be helpful for the learning algorithm.
Great. [NOISE] Um, so all right.
[NOISE] So, um, you know,
a lot of the
uh, building learning
um, uh, building learning
Because what happens in
a typical machine learning
So you figure out what is the problem,
so fix that, uh,
like
And so that fixes the current problem.
And then after fixing the current problem, which,
which is the one we just solved, say,
you then come across a new problem and you have to solve that.
And you fix that problem,
you click somewhere else, another new problem.
So I find that, uh, the
when I'm working on a machine learning project,
it often feels more like software debugging.
Because you're often trying to figure out what doesn't work and you're trying to fix that.
And after you fix that problem,
then another bug surfaces and you fix that,
and you do that, and another, and you kind of keep doing that until it works.
So if I keep talking about,
you know, your algorithm doesn't work,
what do you do next, right?
That, that's kind of the theme of today's presentation.
But that, that is what the
your day-to-day work of developing
a learning algorithm is usually like because it's like,
it doesn't work, and you fix it.
It still doesn't work, then you fix that,
and it still doesn't work, you fix it.
And you do that enough times until it works, right?
That is actually what working
on a learning algorithm often looks like.
All right. So let's say you fix that problem, um,
and you conclude, uh,
through doing error analysis,
that your algorithm is
So you know, you've - you've added a lot more ones,
so the
So let's just add a bunch of ones like I did on that previous board,
so let's just add a lot of ones here,
so the
um, right, okay, good.
Let's say that - sorry [NOISE] ,
see, too many pages of notes here.
Okay, good. So let's say that you find that it achieves now 98 percent
accuracy on training and 50 percent accuracy on the dev set, right?
So very large gap between your training and your dev set performance,
and so a clear sign of high variance.
And so I think in one of the earlier questions,
someone talked about data augmentation,
and so now that we have this clear sign of high variance,
this is a good time to consider data augmentation, right?
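A rough sketch of the check being made here: compare training and dev accuracy and flag a large gap before investing in data augmentation. The 10-point threshold is an arbitrary assumption, and the function name is hypothetical.

```python
def looks_like_high_variance(train_accuracy, dev_accuracy, gap_threshold=0.10):
    """Heuristic only: flag a large train/dev gap. The threshold is arbitrary."""
    return (train_accuracy - dev_accuracy) > gap_threshold


# Numbers from the scenario: 98 percent on training, 50 percent on dev.
print(looks_like_high_variance(0.98, 0.50))  # True
```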
So for audio, this is how you could do data augmentation,
which is, um, collect a bunch of background audio.
You know, so I guess if you're trying to build a lamp that might go into people's homes,
then you could go into your friends' homes and,
you know, with their permission, record, right?
What the background sound of their home looks like.
You know, maybe there are people talking in the background,
maybe the TV on in the background.
Well, whatever goes on in people - people's homes.
Um, and then, it turns out that if you take a, um, say,
a one second clip,
of Robert turn on, or RTO,
and you add that to a background clip,
then you can synthesize
what it sounds like in your friend's house if someone were
to suddenly pop up and say Robert turn on
against the background sound of your friend's house, right?
And - and it turns out that, um,
if you want to make the system
so actually, for example,
have a - I know - I - I - I actually know someone that lives, unfortunately,
close near to a train station and so
their house actually has a lot of train station noise from the
And so what you can do to make your system more
also take
Like
and if you take that noise and take,
uh, in this case, let's say one-second,
one-second of a three-second clip of someone saying Robert turn
on and you
then what you end up with is a 10 second clip of someone saying Robert
turn on against the noisy train in the background type of noise.
All right. And so in order to do data augmentation or data synthesis,
you can take some one-second clips of people saying Robert turn on in
a quiet background and then take
some one second clip of people saying random words, right?
Since you're Stanford, and
the train noise background and then you would have, in this case,
you would have what sounds like train noise,
And then you could generate the labels now as zeros there,
ones there and then zeros there, right?
Because if this is what it actually sounded like in a user's home,
then you want the lamp to turn on after Robert turn on but not after these random words.
So you can pick different random words.
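A toy sketch of this kind of data synthesis, using lists of numbers as stand-ins for real waveforms. The helper names, the sample counts, and the choice to label 50 samples after the phrase as ones are all assumptions for illustration; a real pipeline would load recorded audio instead of random values.

```python
import random


def overlay(background, clip, start):
    """Add `clip` onto a copy of `background`, beginning at sample `start`."""
    mixed = list(background)
    for i, sample in enumerate(clip):
        mixed[start + i] += sample
    return mixed


def synthesize_example(background, trigger_clip, other_clip,
                       trigger_start, other_start, ones_after=50):
    """Mix one trigger-phrase clip and one random-word clip into the
    background, and build per-sample labels: 1 for `ones_after` samples
    right after the trigger phrase ends, 0 everywhere else."""
    audio = overlay(background, trigger_clip, trigger_start)
    audio = overlay(audio, other_clip, other_start)
    labels = [0] * len(background)
    trigger_end = trigger_start + len(trigger_clip)
    for i in range(trigger_end, min(trigger_end + ones_after, len(labels))):
        labels[i] = 1
    return audio, labels


rng = random.Random(0)
# Toy stand-ins for real recordings: 1,000 samples of background noise,
# plus two 100-sample "speech" clips.
background = [rng.uniform(-0.1, 0.1) for _ in range(1000)]
trigger = [rng.uniform(-1.0, 1.0) for _ in range(100)]
random_word = [rng.uniform(-1.0, 1.0) for _ in range(100)]

audio, labels = synthesize_example(background, trigger, random_word,
                                   trigger_start=200, other_start=600)
print(len(audio), sum(labels))  # 1000 50
```

The labels stay at zero after the random-word clip, which is the point: the lamp should turn on after the trigger phrase but not after other speech.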
Great. Um, so let's see.
Great. So what I'd
like you to do is evaluate three different possible ways,
um, to collect noisy data, right?
To - to - to collect this type of background data, right?
Um, and so what I'd like you to do for the next question is let's say you and your team,
you know, have, ah,
Um,
of background noise data and let's say you've decided
that you would like to collect 10 hours of background noise data, right?
Okay? So I'm going to present to you three options.
One is um, you know,
run around Stanford and place
do this with consent and don't - don't, you know,
California as you - you - you're not supposed to -
don't record people without their knowledge and consent, right?
Second is
It turns out if you go to
you know, rain noise or cars driving around.
So you actually - and again, if you do that,
find something that's Creative Commons and sort of
Another thing you could do is, ah,
use
Where you can have people all around the world be paid,
you know, modest amounts of money to submit audio clips, right?
So for the next exercise what I want you to do because, um,
and I want that this exercise of discipline which is,
what I want you to do is, um,
I want you to estimate. Let's see.
What time is it now?
Okay, it's 12:30 PM right now.
What I want you to do is write down three numbers in
the next exercise to estimate if you were to do this,
you know, let's say you were to go do this right now, right?
By what time will you have finished if you were to do option one?
What time would you finish if you were to do option two?
What time would you finish if you were to do option three?
If your goal is to collect 10 hours of data through one of these mechanisms.
Does that make sense? So it's 12:30 PM now.
So what I'd like you to do is just write down three numbers.
First number is what time is it,
what time would it be by the time you collected 10 hours of data,
you know, from around Stanford.
What time would it be right - and if you could do this
in - so if you think you would do it by tonight,
then write 09:00 PM.
If you think it'll, if you think it will take you one week,
then write the date one week from now,
right? Whatever it is.
But just write down three numbers of these three activities, okay?
Why don't you do this one relatively quickly?
Can people do this in like maybe a minute and a half?
All right, cool. This is interesting.
There's actually a surprisingly large variance in the answers.
I'll mention one thing that surprised me,
but first I'll give you my own assessment.
You know, when I'm leading startup teams, we tend to move very fast.
And so I think that,
if the goal is to collect 10 hours of data,
and you have three people who each have a laptop, you can collect
three hours of data per hour, because you've got three recordings going in parallel.
So if I were doing this with, say, two other friends,
I bet we could get this done by tonight, right?
Because if you need nine hours of data, each person needs to collect
three hours of data. You run around Stanford and keep the microphones running,
and I bet I could get this done by 6:00 PM, right? Maybe
even earlier, I don't know.
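The arithmetic behind that estimate can be sketched in a few lines. This is only an illustration: the function name `estimated_finish`, the `overhead_hours` parameter, and the calendar date are assumptions, and it presumes each recorder captures one hour of audio per wall-clock hour with all recorders running in parallel.

```python
from datetime import datetime, timedelta

def estimated_finish(start, hours_needed, recorders, overhead_hours=0.0):
    """Time at which `hours_needed` hours of audio have been collected.

    Assumes every recorder captures one hour of audio per wall-clock hour,
    all recorders run in parallel, and `overhead_hours` is a one-off setup cost.
    """
    if recorders < 1:
        raise ValueError("need at least one recorder")
    wall_clock_hours = overhead_hours + hours_needed / recorders
    return start + timedelta(hours=wall_clock_hours)

# 12:30 PM start; the calendar date is arbitrary.
start = datetime(2024, 1, 1, 12, 30)
finish = estimated_finish(start, hours_needed=9, recorders=3)
print(finish.strftime("%H:%M"))  # 15:30
```

With nine hours of data and three recorders, the raw recording time ends mid-afternoon, so a 6:00 PM budget leaves a couple of hours for walking around and setup.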
The second option, downloading clips online, is actually tricky.
It turns out
there are people that
have trouble sleeping at night, so they listen to highway noise or whatever.
And so there are these, you know, 20 hours of
highway noise clips online.
But I don't know how those clips are generated.
And I suspect a lot of them loop, right?
Meaning it's the same one hour played over and over.
So I actually think it's harder than
it looks to get 10 hours of
non-repetitive data. And it's one of those things, you know,
if I take an hour of highway sound and loop
it, you can't tell the difference, because all highway sound sounds the same.
I just can't tell one minute of highway sound from another.
But if you have one hour of highway sound looped 10 times,
the learning algorithm will actually perform much worse
than if you have 10 hours of fresh highway sound.
So this I would actually have a harder time doing.
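One rough way to check a downloaded clip for looping is to cut it into fixed-size chunks and count how many are distinct. The sketch below is an assumption-laden illustration, not something from the lecture: `unique_fraction` is a hypothetical helper, the random integers stand in for 16-bit audio samples, and exact hashing only catches bit-identical repeats that line up with chunk boundaries (re-encoded or shifted loops would need a fuzzier comparison).

```python
import hashlib
import random
import struct

def unique_fraction(samples, chunk_size):
    """Fraction of fixed-size chunks that are distinct (1.0 means no exact repeats)."""
    seen = set()
    total = 0
    for start in range(0, len(samples) - chunk_size + 1, chunk_size):
        chunk = samples[start:start + chunk_size]
        # Pack as signed 16-bit samples so identical audio hashes identically.
        digest = hashlib.sha1(struct.pack(f"{chunk_size}h", *chunk)).digest()
        seen.add(digest)
        total += 1
    return len(seen) / total if total else 0.0

rng = random.Random(0)
one_hour = [rng.randint(-32768, 32767) for _ in range(1000)]  # stand-in for one hour
looped = one_hour * 10                                         # same hour played 10 times
fresh = [rng.randint(-32768, 32767) for _ in range(10000)]     # stand-in for 10 fresh hours

print(unique_fraction(looped, 100))  # 0.1
print(unique_fraction(fresh, 100))   # 1.0
```

A fraction well below 1.0 suggests the clip contains far less unique audio than its duration implies.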
Because of these problems,
I would probably budget until sometime tomorrow, right?
Maybe 9:00 PM or something.
I'm not sure. The one surprise
to me was that some people thought they could do the third option by tonight.
Again, I've used Mechanical Turk, and it's
a huge process to set up,
especially to get Turkers recording on a microphone.
You'd have to implement something,
say in Flash, so they can speak into their web browser.
And then Flash isn't supported.
It's actually not that easy to get a lot of
Turkers to do this, and the global supply of Turkers is also limited.
So if I were doing this, I would
probably budget, I don't know, maybe a week or something, right?
Hard to say. I'm not sure.
But the specific opinion isn't that important. I want
you to go through this exercise because this is how
efficient startup teams should work: you know, list the options, estimate how long you
think it'll take to do each of them, and then pick one. And I think
you can have a debate about how high quality the data is. I
think you can get very high-quality data from recording it yourself and from Mechanical Turk.
I just don't trust a lot of those online clips.
But recording it yourself is really fast and you can get pretty high-quality data.
I would probably do that to collect the background sound to get going, right?
But I think that part of how
fast-moving teams work is pretty much exactly what you did,
which is the exercise of
estimating by what time we can get each option done, and then using that to pick an option, right?
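The "estimate, then pick" step can be written down as a tiny comparison. This is a sketch only: the `Option` class and `pick` function are hypothetical, the hours are the rough budgets given in the lecture measured from 12:30 PM (6:00 PM today, 9:00 PM tomorrow, one week), and the quality flag encodes the stated distrust of online clips.

```python
from dataclasses import dataclass

@dataclass
class Option:
    name: str
    hours_to_finish: float   # estimated wall-clock hours from now
    trusted_quality: bool    # do we trust the data this option produces?

options = [
    Option("record around campus", 5.5, True),
    Option("download online clips", 32.5, False),
    Option("Mechanical Turk", 168.0, True),
]

def pick(options):
    """Fastest option among those with trusted quality (or among all, if none are trusted)."""
    candidates = [o for o in options if o.trusted_quality] or options
    return min(candidates, key=lambda o: o.hours_to_finish)

print(pick(options).name)  # record around campus
```

The point is less the code than the habit: write the estimates down so the choice between options is explicit.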
And then I want to just mention one last thing,
which is that these differences matter, right?
You know, I've built a lot of machine learning systems.
Oh, and
I think, by the way, if you do everything we just
described, and you'll see this later in a problem set,
you can actually, with pretty much the
set of ideas that we just went through today,
build a
pretty decent trigger word
or wake word detection system.
In fact, we'll have you do that in a homework exercise.
And now, you know, when you get to that homework exercise, when you do RNNs,
you know how you could come up with this sort of process yourself
if you didn't already know how to make these types of choices. Yeah, question?
Just one question. In my research [inaudible],
at the beginning it seemed like my microphone wasn't important,
but it messed up my results. What do you think about that?
Yeah, so the question is,
does the microphone affect your result, right?
[OVERLAPPING] My advice would be to
get something going quick and dirty and then develop a dataset, right,
with the actual type of data you would get from
your real microphone, and then see if there is a problem. It may be;
different microphones do have different characteristics.
And if it is a problem, then go back and think about
how you collect data that's more representative of how you'll
test. I want to mention one more quick thing. You're being handed
class surveys, but I want to do something real quick, which is,
I want to tell you why these things really matter, which is,
if this is, uh, performance, right?
And if this is today, and you're the CEO of a startup, remember,
that's what we're doing in this lesson,
and this is six months from now,
and this is 12 months from now.
You know, maybe you have a competitor.
Actually, maybe, I don't know,
[LAUGHTER] maybe because we've talked about this so much in
this class, maybe two of you in this class are going to build startups and be competitors.
But for most machine learning teams,
you know, the error actually goes down over time as you work on the problem,
and this is what I see in practice.
You know, we work on the project, improve the system,
and the error goes down as you work on
this over the next 12 months, say. But if you're really the CEO of
a startup doing this,
don't do something that takes you two days
if you can get a similar result in one day.
The difference is not that you save one day;
the difference is that you are 2x faster, right?
And that compresses
this whole chart.
Then you want to
be the startup that, you know, makes
the same amount of progress in six months instead of 12 months, right?
And if you're able to do this, then
your startup will actually perform much better in the marketplace,
assuming accuracy is important, which it seems to be for wake word.
And so don't think of this as saving you a day here
and there; think of this as making your team twice as fast.
And that's the difference between this level
of performance and that level of performance.
So that's why when I'm, you know, building teams
and executing these projects, I tend to be pretty obsessive about
making sure we're very efficient in exploring the options.
Don't wait till tomorrow to collect data if you can collect the
data by today, because the difference is not that you wasted 12 hours;
the difference is that you are twice as slow as a company, right?
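A toy model makes the "twice as fast" point concrete. Everything here is an assumption for illustration: the function `error_after`, the 20% starting error, and the idea that each completed iteration multiplies the error by a fixed factor are not from the lecture, and real projects do not improve this smoothly.

```python
def error_after(days, days_per_iteration, start_error=0.20, keep=0.99):
    """Error after `days`, if every completed iteration multiplies the error by `keep`."""
    iterations = days // days_per_iteration
    return start_error * keep ** iterations

slow_12_months = error_after(360, days_per_iteration=2)  # two-day iterations
fast_6_months = error_after(180, days_per_iteration=1)   # one-day iterations
fast_12_months = error_after(360, days_per_iteration=1)

print(slow_12_months == fast_6_months)   # True: same number of iterations
print(fast_12_months < slow_12_months)   # True: the faster team ends the year ahead
```

Under this model the team with one-day iterations reaches in six months the error level the slower team reaches in twelve, which is the gap between the two points on the chart.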
So hopefully this example, and
your ongoing experiences throughout this course, can help
you continue to get better at this.
Right. The last thing we want to do is
hand out a survey,
an anonymous survey,
to get some feedback from you about this class.
And whenever we get these surveys,
we end up acting on them.
Thanks to feedback from previous generations of students,
we've already been gradually making the class better.
Ken and I actually read all of these responses ourselves and try to find
ways to take your feedback to improve the class. So if you can take, you know, five minutes
for this survey, you can hand it in; just drop it off in front.
We're very grateful for your suggestions.
Okay?
If you haven't signed in yet, you can still do so, but that's it for today.
So please fill out the survey and
drop it off in front, and then we'll wrap up. Okay, thank you.
知识点
重点词汇
embedded [ɪm'bedɪd] v. 嵌入(embed的过去式和过去分词形式) adj. 嵌入式的;植入的;内含的 { :6007}
appropriately [ə'prəʊprɪətlɪ] adv. 适当地;合适地;相称地 { :6089}
continuation [kənˌtɪnjuˈeɪʃn] n. 继续;续集;延长;附加部分;扩建物 {toefl gre :6109}
detection [dɪˈtekʃn] n. 侦查,探测;发觉,发现;察觉 {cet4 cet6 gre :6133}
detections [dɪ'tekʃnz] n. 察觉,发觉,侦查( detection的名词复数 ) { :6133}
downloaded [ˈdaunləudid] vt. [计] 下载 { :6382}
download [ˌdaʊnˈləʊd] vt. [计] 下载 {gk :6382}
downloading ['daʊnləʊdɪŋ] n. 下装,[计] 下载;[计] 下传 v. [计] 下载(download的现在分词形式) { :6382}
tricky [ˈtrɪki] adj. 狡猾的;机警的 { :6391}
robust [rəʊˈbʌst] adj. 强健的;健康的;粗野的;粗鲁的 {cet6 ky toefl ielts gre :6419}
hygiene [ˈhaɪdʒi:n] n. 卫生;卫生学;保健法 {toefl ielts gre :6492}
randomly ['rændəmlɪ] adv. 随便地,任意地;无目的地,胡乱地;未加计划地 { :6507}
rigorous [ˈrɪgərəs] adj. 严格的,严厉的;严密的;严酷的 {cet6 ky toefl ielts :6606}
uncommon [ʌnˈkɒmən] adj. 不寻常的;罕有的 adv. 非常地 { :6725}
unlimited [ʌnˈlɪmɪtɪd] adj. 无限制的;无限量的;无条件的 {cet6 :6742}
inaudible [ɪnˈɔ:dəbl] adj. 听不见的;不可闻的 { :6808}
algorithm [ˈælgərɪðəm] n. [计][数] 算法,运算法则 { :6819}
algorithms [ˈælɡəriðəmz] n. [计][数] 算法;算法式(algorithm的复数) { :6819}
dubious [ˈdju:biəs] adj. 可疑的;暧昧的;无把握的;半信半疑的 {cet6 ky toefl ielts gre :6855}
temporal [ˈtempərəl] n. 世间万物;暂存的事物 adj. 暂时的;当时的;现世的 n. (Temporal)人名;(法)唐波拉尔 {toefl gre :6868}
squash [skwɒʃ] n. 壁球;挤压;咯吱声;南瓜属植物;(英)果汁饮料 vt. 镇压;把…压扁;使沉默 vi. 受挤压;发出挤压声;挤入 {cet6 toefl gre :6923}
simplifying [ˈsimplifaiŋ] v. 简约,简化(simplify进行时形式) { :7074}
cafeteria [ˌkæfəˈtɪəriə] n. 自助餐厅 {gk cet4 cet6 ky ielts :7076}
Turk [tә:k] n. 土耳其马;土耳其人 { :7202}
cardinal [ˈkɑ:dɪnl] adj. 主要的,基本的;深红色的 n. 红衣主教;枢机主教;鲜红色;【鸟类】(北美)主红雀 {ky toefl ielts gre :7343}
binary [ˈbaɪnəri] adj. [数] 二进制的;二元的,二态的 { :7467}
rainier [ ] n. 雷尼尔山;赤阳苹果 n. (Rainier)人名;(英)雷尼尔;(法)兰尼埃 { :7477}
validate [ˈvælɪdeɪt] vt. 证实,验证;确认;使生效 {toefl gre :7516}
sharpens [ˈʃɑ:pənz] v. (使)提高( sharpen的第三人称单数 ); (使声音)变得尖锐; (使)变得更好(或技术更高、更有效等); (使)变得锋利 { :7542}
blog [blɒg] n. 博客;部落格;网络日志 { :7748}
intuitions [ˌɪntjuˈiʃənz] n. 直觉( intuition的名词复数 ); 凭直觉感知的知识; 直觉力 { :7905}
hypothetical [ˌhaɪpəˈθetɪkl] adj. 假设的;爱猜想的 {gre :8049}
moderately [ˈmɒdərətli] adv. 适度地;中庸地;有节制地 {cet6 :8132}
berkeley ['bɑ:kli, 'bә:kli] n. 贝克莱(爱尔兰主教及哲学家);伯克利(姓氏);伯克利(美国港市) { :8189}
isaac ['aizәk] n. 以撒(希伯来族长, 犹太人的始祖亚伯拉罕和萨拉的儿子);艾萨克(男人名) { :8205}
hack [hæk] n. 砍,劈;出租马车 vt. 砍;出租 vi. 砍 n. (Hack)人名;(英、西、芬、阿拉伯、毛里求)哈克;(法)阿克 {gre :8227}
reviewers [rɪv'ju:əz] n. 评论者(reviewer的复数);评审员 { :8282}
tha [,ti ɛtʃ 'e] abbr. thaumatin 竹芋蛋白 { :8395}
variability [ˌveəriəˈbɪləti] n. 可变性,变化性;[生物][数] 变异性 {toefl :8403}
skimming ['skɪmɪŋ] n. 撇取浮沫;浮渣 v. 撇去…的浮物(skim的ing形式) { :8515}
skim [skɪm] n. 撇;撇去的东西;表层物;瞒报所得的收入 adj. 脱脂的;撇去浮沫的;表层的 vt. 略读;撇去…的浮物;从…表面飞掠而过;去除;(为逃税而)隐瞒(部分收入) vi. 浏览;掠过 {cet4 cet6 ky toefl ielts gre :8515}
augment [ɔ:gˈment] n. 增加;增大 vt. 增加;增大 vi. 增加;增大 {cet6 ky toefl ielts gre :8589}
neural [ˈnjʊərəl] adj. 神经的;神经系统的;背的;神经中枢的 n. (Neural)人名;(捷)诺伊拉尔 { :9310}
compress [kəmˈpres] vt. 压缩,压紧;精简 vi. 受压缩小 {cet4 cet6 ky toefl ielts gre :9510}
sparse [spɑ:s] adj. 稀疏的;稀少的 {toefl ielts gre :9557}
browser [ˈbraʊzə(r)] n. [计] 浏览器;吃嫩叶的动物;浏览书本的人 { :9689}
nope [nəʊp] adv. 不是,没有;不 { :9734}
robotic [rəʊˈbɒtɪk] n. 机器人学 adj. 机器人的,像机器人的;自动的 { :10115}
metric [ˈmetrɪk] adj. 公制的;米制的;公尺的 n. 度量标准 {cet4 cet6 ky ielts :10163}
metrics ['metrɪks] n. 度量;作诗法;韵律学 { :10163}
obsessive [əbˈsesɪv] adj. 强迫性的;着迷的;分神的 { :10199}
strategically [strə'ti:dʒɪklɪ] adv. 战略性地;战略上 { :10245}
micro [ˈmaɪkrəʊ] adj. 极小的;基本的;微小的 n. 微型计算机;微处理器 { :10740}
sequential [sɪˈkwenʃl] adj. 连续的;相继的;有顺序的 {gre :10797}
synthesized ['sɪnθɪsaɪzd] adj. 合成的;综合的 v. 合成(synthesize的过去分词);综合 { :10905}
synthesize [ˈsɪnθəsaɪz] vt. 合成;综合 vi. 合成;综合 {cet6 toefl :10905}
wha [ ] [医][=warmed,humidified air]温暖、潮湿的空气 { :11046}
whack [wæk] n. 重击;尝试;份儿;机会 vt. 重打;猛击;击败;削减 vi. 重击 { :11376}
manually ['mænjʊəlɪ] adv. 手动地;用手 {toefl :12167}
amazon ['æməzən] 亚马逊;古希腊女战士 { :12482}
configure [kənˈfɪgə(r)] vt. 安装;使成形 { :13210}
prioritizing [praiˈɔritaizɪŋ] v. 目标优选;指定优先权;依主次程序排列(prioritize的ing形式) { :13446}
rigorously ['rɪɡərəslɪ] adv. 严厉地;残酷地 { :13742}
unbalanced [ˌʌnˈbælənst] adj. 不平衡的;错乱的;不稳定的;收支不平衡的,未决算的 v. 使失去平衡;使(精神)错乱(unbalance的过去分词) { :14269}
anonymously [ə'nɒnɪməslɪ] adv. 不具名地;化名地 { :14502}
proportionate [prəˈpɔ:ʃənət] adj. 成比例的;相称的;适当的 vt. 使成比例;使相称 { :15394}
mathematically [ˌmæθə'mætɪklɪ] adv. 算术地,数学上地 { :15474}
oscillate [ˈɒsɪleɪt] vt. 使振荡;使振动;使动摇 vi. 振荡;摆动;犹豫 {ielts gre :15486}
circuitry [ˈsɜ:kɪtri] n. 电路;电路系统;电路学;一环路 { :15641}
stanford ['stænfәd] n. 斯坦福(姓氏,男子名);斯坦福大学(美国一所大学) { :15904}
难点词汇
SP [ ] abbr. 自行驱动的(Self Propelled);服务提供商 (Service Provider) { :16790}
waveform ['weɪvfɔ:m] n. [物][电子] 波形 { :17729}
labeling ['leɪblɪŋ] n. 标签;标记;[计] 标号 v. 贴标签;分类(label的现在分词) { :17997}
dataset ['deɪtəset] n. 资料组 { :18096}
doable [ˈdu:əbl] adj. 可做的 { :18441}
anytime ['enɪˌtaɪm] adv. 任何时候;无例外地 { :19391}
scrappy [ˈskræpi] adj. 爱打架的;杂凑的;不连贯的;生气勃勃的 {gre :20002}
brainstorm [ˈbreɪnstɔ:m] n. 集思广益;头脑风暴;灵机一动 vt. 集体讨论;集思广益以寻找 vi. 集体讨论;动脑筋;出主意 { :20387}
abbreviate [əˈbri:vieɪt] vt. 缩写,使省略;使简短 vi. 使用缩写词 {toefl gre :20690}
brainstorming [ˈbreɪnstɔ:mɪŋ] n. 集体研讨;发表独创性意见 v. 集思广益以寻找;集体自由讨论(brainstorm的ing形式) { :20927}
microfilms ['maɪkrəʊfɪlmz] n. (拍摄文件等用的)缩微胶卷( microfilm的名词复数 ) { :22448}
augmentation [ˌɔ:ɡmen'teɪʃn] n. 增加,增大;增加物 {gre :22669}
RI [ ] abbr. (美国)罗得岛州 (Rhode Island);剩余收入(residual income) n. (Ri)人名;(日)理(名) { :24818}
rea [ ] abbr. (美)农村电气化管理局(Rural Electrification Administration);铁路快运代办处(Railway Express Agency) n. (Rea)人名;(英)雷;(西、意、罗、瑞典)雷亚 { :25673}
debugging ['di:'bʌgɪŋ] n. 调试以排除故障 v. 排除故障;发现并改正错误(debug的ing形式) { :28755}
dev [dev] abbr. 发展(develop);偏差(deviation);开发人员(developer);设备驱动程序 n. (Dev)人名;(尼、印)德夫 { :28908}
workflow ['wɜ:kfləʊ] n. 工作流,工作流程 { :31107}
prototyping [prəʊtə'taɪpɪŋ] n. [计] 样机研究;原型设计 { :37954}
prob [p'rɒb] abbr. problem 问题; probability 概率; problematic 问题的; problematical 问题的 { :38611}
rebalancing [ ] [财]调整资金组合 { :41158}
mindset [ˈmaɪndset] n. 心态;倾向;习惯;精神状态 { :42826}
eunice [ˈju:nis] n. 尤妮斯(女子名,义为快乐的胜利) { :44408}
生僻词
Asimov [ ] 阿西莫夫(人名)
brainstormed [ˈbreɪnˌstɔ:md] v. 集中各人智慧猛攻( brainstorm的过去式和过去分词 )
caltrain [ ] [网络] 加州铁路;加州火车;加州湾区铁路
email ['i:meɪl] n. 电子信函 vt. 给…发电子邮件 n. (Email)人名;(法)埃马伊 {zk ielts :0}
emails ['iːmeɪl] n. 电子信函 vt. 给…发电子邮件 n. (Email)人名;(法)埃马伊
ght [ ] abbr. growth hormone treatment 生长激素治疗; Goodenough-Harris test 戈-哈试验; general hypoxemic test 普通血氧不足试验; glaucoma hemifield test 青光眼半视野试验
GitHub [ ] [网络] 源码托管;开源项目;控制工具
google [ ] 谷歌;谷歌搜索引擎
hacky ['hækɪ] n. 出租汽车司机
hyperparameter [ ] [网络] 超参数;分别有一个带有超参数
overfitting [ ] n. 过适;[数] 过度拟合 v. 过适(overfit现在分词)
rebalance [re'bæləns] 再平衡
resample [ri:'sæmpl] 重采样,重复取样
resampling [re'sɑ:mplɪŋ] 再取样,重采样,重取样
reweighting [ ] [网络] 权;重权
startup ['stɑ:tʌp] n. 启动;开办
startups [ ] n. 创业(startup的复数);开办
youtube ['ju:tju:b] n. 视频网站(可以让用户免费上传、观赏、分享视频短片的热门视频共享网站)
词组
a clip [ ] [网络] 主叫线路识别显示
a hack [ ] [网络] 网络攻击
a tech [ ] [网络] 泓晟;科技;艾德贴标机
Amazon Mechanical Turk [ ] [网络] 亚马逊土耳其机器人;亚马逊人端运算平台;亚马逊的土耳其机器人网站
audio clip [ ] [网络] 音频剪辑;音频复制文件;音频剪切
audio data [ ] [网络] 语音数据;音频数据;声音数据
audio source [ ] [网络] 音频源;音源;音频文件
batch number [ ] un. 组号 [网络] 批号;批次号;生产批号
binary classification [ ] 二元分类
blog post [ ] [网络] 博客文章;博客帖子;部落格文章
chat with [ ] [网络] 和…聊天;与…闲聊;与某人聊天
classification problem [ ] 分类问题
clip out [ ] vt.剪下来
clip to [ ] vt.夹在...上
data synthesis [ ] 数据合成
detection problem [ ] 探测问题
detection system [ ] un. 检测系统;探测系统 [网络] 侦测系统;检测系统试剂;检测体系
duration of [ ] 持续时间
good algorithm [ ] un. 优质算法
halfway through [ ] [网络] 过半了;在某事过了一半时;没有译准
homework problem [ ] [网络] 作业性问题;家庭作业问题
homework problems [ ] [网络] 功课问题
horizontal axis [ˌhɔriˈzɔntəl ˈæksis] un. 横轴;水平轴 [网络] 横坐标;水平轴线;横轴线
in parallel [in ˈpærəlel] adv. 并联;并行 [网络] 并行的;平行;并联的
Isaac Asimov [ ] [网络] 阿西莫夫;艾西莫夫;艾萨克·阿西莫夫
learning algorithm [ ] [网络] 学习演算法;学习算法;学习机制
light bulb [lait bʌlb] n. 灯泡 [网络] 电灯泡;右下灯泡;白炽灯
Mechanical Turk [ ] 土耳其机器人
minus one [ ] [网络] 桃花源;幸福意外;谢谢你捧场
neural net [ ] adj. 神经网络的 [网络] 神经网络法;类神经网分析;神经法则
neural network [ˈnjuərəl ˈnetwə:k] n. 神经网络 [网络] 类神经网路;类神经网络;神经元网络
neural network architecture [ ] 《英汉医学词典》neural network architecture 神经网络构筑学
neural networks [ ] na. 【计】模拟脑神经元网络 [网络] 神经网络;类神经网路;神经网络系统
noisy data [ˈnɔizi ˈdeitə] [医]噪声数据
on the web [ ] [网络] 在互连网上;在互联网上;网站内容
out of whack [aut ɔv hwæk] na. 〈非正式〉坏了;不对头;(身体)不舒服;(机械)运转不正常 [网络] 紊乱;出故障;运行不正常
plus minus [ ] un. 正负 [网络] 加减;加减符;加减乐团
temporal data [ˈtempərəl ˈdeitə] [网络] 暂态数据;时态数据;时序资料
the algorithm [ ] [网络] 算法
the duration [ ] [网络] 持续时间;的期限;时值
the horizontal [ ] [网络] 中脑水平切面
the marketplace [ ] [网络] 市集;市场购物中心;市场价值
the vertical [ ] [网络] 垂直性
the web [ ] [网络] 网;网页;网络
to download [ ] 下载
ton of [ ] 大量,许多
trigger word [ˈtriɡə wə:d] [网络] 触发字;触发语;触发者
variance in [ ] ...的变化
vertical axis [ˈvə:tikəl ˈæksis] na. 【数】(直)立轴 [网络] 垂直轴;纵轴;竖轴
WEB BROWSER [web ˈbrauzə] n. Web浏览器 [网络] 网络浏览器;网页浏览器;网路浏览器
word detection [ ] [计] 字检测
zero detection [ ] 检“0”
zero zero [ˈziərəu ˈziərəu] 零
惯用语
all right
and then
and um
and you know
i don't know
i think
let's say
robert turn on
so um
train noise
turn on
turned off
you know
单词释义末尾数字为词频顺序
zk/中考 gk/高考 ky/考研 cet4/四级 cet6/六级 ielts/雅思 toefl/托福 gre/GRE
* 词汇量测试建议用 testyourvocab.com
