Hacker News

As a student of classical music, I'd like to point out what I believe to be the main in-humanness in the generated music that makes it stray from "real" classical.

Some have cried violation of counterpoint, but I don't think that is such an issue here. CP is mostly about how notes are linked together on a small scale, and this is rather the network's strong point: it seems to concentrate on the associativity between notes and between chords, and any given moment seems to be composed with a vision that only extends to a few bars (or even just a few beats) around that moment.

The main problem is therefore large-scale structure. For one, recurrence of melodies (at key moments and transpositions) is crucial to creating the emotional value of classical music. None of this appears here.

And secondly, possibly the greatest shortfall of the present neural network (I'm ignoring performance, of course) is the harmonic structure. Classical music, let's say later than medieval and earlier than late-romantic in style, generally has the harmonic structure of a recursive cadence. Harmonic cadences are what give emotional power to harmony, but this NN is painfully incapable of creating any.

That being said, I don't think this problem is inherent to the approach of creating music with NNs. Right now it sounds like what you'd get with a well-crafted Markov chain, but NNs can go beyond this, and this article is exactly the kind of thing that will instigate this evolution.
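
For anyone wondering what the Markov chain comparison above means in practice, here is a minimal sketch of a first-order chain over note names. It is not the article's network; the training melody and the function names are made up for illustration. Each note is chosen by looking only at the previous note, which is why the local transitions sound plausible while nothing ties bar 1 to bar 20.

```python
import random
from collections import defaultdict

def train_markov(notes):
    """Map each note to the list of notes that followed it in the training melody."""
    table = defaultdict(list)
    for current, following in zip(notes, notes[1:]):
        table[current].append(following)
    return dict(table)

def generate(table, start, length, seed=0):
    """Random walk over the table: each note depends only on the one before it."""
    rng = random.Random(seed)
    melody = [start]
    while len(melody) < length:
        choices = table.get(melody[-1])
        if not choices:  # dead end: this note never had a successor in training
            break
        melody.append(rng.choice(choices))
    return melody

# Hypothetical training melody (note names only, no rhythm or harmony).
training = ["C", "E", "G", "E", "C", "D", "F", "A", "F", "D", "C"]
table = train_markov(training)
melody = generate(table, "C", 16)
print(melody)
```

Every adjacent pair in the output was seen in training, but there is no mechanism for a theme to return or for a cadence to be prepared ahead of time.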



I agree, I kind of feel like there should be a couple different layers of generation.

A "planning" layer that lays out the song plan (ABACABBA, etc.)

A composing layer that fills in those sections. And maybe even generates some slight differences between the same-named sections for variety.

A performance layer that plays it back with a simulation of human performance metrics (slight jitter to note placement, emotive crescendos, suggestive variations in note-length, etc.).
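
A rough sketch of how those three layers might fit together, under heavy assumptions: the "composer" here is just a random pick from a C major scale standing in for a trained model, the form string is fixed rather than learned, and all names (plan_form, compose_sections, perform) are hypothetical.

```python
import random

def plan_form(form="ABACABBA"):
    """Planning layer: the section layout, here just a fixed string."""
    return list(form)

def compose_sections(labels, bars_per_section=2, seed=0):
    """Composing layer: one phrase per distinct label, reused whenever it repeats."""
    rng = random.Random(seed)
    scale = [60, 62, 64, 65, 67, 69, 71, 72]  # C major as MIDI pitches
    phrases = {}
    for label in sorted(set(labels)):
        # Stand-in for a trained model: 4 random scale notes per bar.
        phrases[label] = [rng.choice(scale) for _ in range(4 * bars_per_section)]
    return phrases

def perform(labels, phrases, beat=0.5, jitter=0.02, seed=1):
    """Performance layer: timed notes with a small random offset on each onset."""
    rng = random.Random(seed)
    events, t = [], 0.0
    for label in labels:
        for pitch in phrases[label]:
            onset = max(0.0, t + rng.uniform(-jitter, jitter))
            events.append((round(onset, 3), pitch, label))
            t += beat
    return events

labels = plan_form()
phrases = compose_sections(labels)
events = perform(labels, phrases)
print(events[:4])
```

This version repeats same-named sections note for note; the slight variations between repeats, crescendos and note-length changes mentioned above are left out to keep it short.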


Maybe this kind of thing can also be learned by a secondary NN. It just needs to be trained with data collected over large scale sections of the example music.

But this NN doesn't solve the greatest problem in Classical music: that only 3% of people take the time to appreciate it.


Let's design a neural network to appreciate classical music, then spawn a few billion instances. Greatest problem: solved.


http://xkcd.com/1546/ comes to mind.


Why do you think that is the greatest problem? If the concern is audience numbers, what's the goal, and why? In classical music, my experience is that there are many more highly accomplished practitioners than can ever be supported by audience demand, but that's only a problem if there's an argument for a set goal for audience figures. There are of course thousands of pursuits which demand time for their appreciation.


Far more than 3% of people appreciate classical music. I've been to countless sold-out symphony concerts. It may not be their favorite genre, but people pay attention and enjoy it. It's only 3-7% of new music sales in stores, which is all that the media wants you to think about.


For pop-rock and other styles of mainstream music there are massive libraries of chord progressions that analyse the structure (or big picture) of all the "hits". Those would be really helpful with the "planning" layer, e.g.:

http://amitkohli.com/wp-content/uploads/2015/02/InteractiveC...

I bet you could get much more accurate results by feeding in all those progressions as one of the reference layers.


There is a series of works by François Pachet and his collaborators at Sony's research labs in Paris that uses HMMs with constraints to obtain something that preserves long-range structure. I don't have the links at the moment, but he got impressive results.


Will "NN Teacher" be a job title soon? In the present situation we have an NN who learned empirically and who needs a bit of structure: "Here are the sound plans you need to study, here are the melodies, here are the styles and trends".


You mention Markov chains, and as a student of English literature I'd like to point out what I believe to be the main in-humanness in generated prose that makes it stray from "real" prose.

Some have mentioned pointless sentences, but I don't think that is such an issue for modern literature. Modern prose mostly has to do with what words come after each other on a small scale, such as a single phrase, or the associativity between words and between sentences, and at any given moment most words are only used for a few sentences, then replaced by other words.

The main problem is therefore large-scale structure, chapter to chapter. For one, the recurrence of a theme throughout several volumes of work is crucial to creating its emotional value and cohesiveness. None of which appears in Markov Chains.

And secondly, possibly the greatest shortfall of Markov Chains (ignoring performance) is in plot. They're just not very imaginative. Prose, let's say later than ancient epics, has interesting, recursive plot devices. Plot is what gives emotional power to sentences, and most Markov Chains are painfully incapable of creating any.

That being said, I don't think this problem is inherent to the problem of writing prose using Markov Chains. Right now it sounds like what you'd get with a crowd-sourced writing prompt, but Markov Chains can go beyond this - and literary criticism is what it will take to bring Markov Chains up to the standard of popular writers like Stephen King.

It's pretty clear that in the near future, we will laugh at the idea of reading something that someone hand-typed from their own thoughts, instead of simply reading the output of a one-dimensional random walk. I hope if I have grandchildren who are as interested in modern literature as I am, their degree will be a bachelor of science, with the only work they have to do being to quantify and remove any remaining guesswork from the science of computer-generated writing. Even today, it is hard to understand why anyone still reads writing written by someone. It's as quaint as a telegram.


The main path going forward in improving Markov chains is expanding the individual "units" that are selected by these chains. Current Markov chains only randomly select words to string together into semi-coherent thoughts, but what if they instead randomly selected paragraphs, or even entire chapters? This will promote large-scale structure within the book, and ensure the plot will always be varied and engaging.

I hope that your grandchildren's Bachelor of Science would train them in curating brilliant works of art that can then be fed into the Markov chain to perform the one-dimensional random walk.


I was actually making fun of this way of producing "art" (such as music.) Obviously since the algorithm has no human 'state of mind' or emotion, it cannot include any in the art whatsoever. My comment was completely satirical.


My comment was satirical too actually. The whole idea of thinking computers have created something "new and original" just by shoving together bits and pieces of human-produced works of art seems somewhat ridiculous. It's not entirely ridiculous, as the computer did do some manual labor of piecing together the words, but ultimately, you are relying on a corpus of pre-existing work and then reassembling it together. Any emotion (or indeed, most of anything good) comes from the corpus, not from the bot analyzing and imitating the corpus.


Precisely. Compare the example in the post with this: https://www.youtube.com/watch?v=ZevgEUVeZ9Y

Notice the overarching theme, the experience that the music "shows you", the arc it leads you through. In comparison, the generated music is but a chain of pleasant and nice-sounding moments, with no overarching emotional arc uniting them.

Now is that the exclusive domain of humans? Will NNs eventually write music that passes for a human composition to a skilled human jury (a musical Turing test)?


Yes. It sounds good at the scale of a few seconds, but then it's clear there's no higher level structure. That's about what you'd expect from a recurrent ANN. Nice demo, though.


> but NNs can go beyond this, and this article is exactly the kind of thing that will instigate this evolution.

Thought experiment: one day it is not uncommon for some deeply moving, emotional works to be composed entirely by computers (though performed by humans). How will we, as a society, react to that? Will we reject it as being 'unauthentic', or use it as an opportunity for introspection?

It would even probably not be unreasonable to think that these NNs will be given names, and different NNs will be characteristically different, so then you might see a Wikipedia page or an album from a specific NN in much the same way we'd see a human composer today. How weird would that be?

Edit: or, even, NNs trained on specific composers (and the music they'd have heard), to try and create 'new' works from existing long-dead composers, or even just to complete famous half-finished works. How blasphemous would that be? But for a sufficiently competent NN, I can imagine the output might be very interesting.

Edit 2: (apologies, but this is really out there) I have also wondered how much information is sufficient to 'recreate state'–that is to create an 'AI' that passably mimics a specific real person (kinda like a Turing Test), and in that sense creates a pseudo-immortality.


Arguably, what makes a worthy creative different from someone just spewing out content is their ability to filter output by applying their taste. There’s the never-ending loop where you create and evaluate and necessarily throw away.

A person with developed taste who ‘filters’ someone else’s creation, recognizes great work, and helps the creator shape their output is thus a part of the above loop, and depending on specifics may be of importance comparable to or greater than that of the creator.

The balance between contributions of different entities that cause a work to be created and known, or a style to be formed, is always fluid (among singers, performers, front-personas, authors, producers, mentors, labels, etc.). Replace one of the components with a computer and the overall picture doesn’t change much.

If a piece of music was generated by software, then whoever set that software up and filtered its output is the creator. That may include the programmer, those who were using the software, and other people who directly influenced the creation in a significant way.

If a person using some generative algorithm doesn’t feel like their input was substantial enough, they might use a pseudonym. Attributing music to a computer explicitly would be a purely marketing move, and it doesn’t change the fact that the author is always a conscious being (or beings), which, unless we’re in the singularity, a computer isn’t.


> Attributing music to a computer explicitly would be a purely marketing move, and it doesn’t change the fact that the author is always a conscious being (or beings), which, unless we’re in the singularity, a computer isn’t.

I don't know. I'm sympathetic to this view (and for the record, I wasn't going anywhere near a hard AI/singularity argument), but on the other hand, I think after enough iterations you won't be able to find where that human input actually comes in. When we finally get an NN that produces something actually great, will we be able to point to a specific line of code, or a specific input, or a specific programmer whose taste resulted in that? We already struggle to understand the inner workings of neural nets.

So you can argue the 'taste' step comes into the selection process. Somebody has to sift through the output of the NN to choose what's good and what isn't. But what if that's automated? A different output to a different member of the population, so then the NN can test itself, and it decides what is worthy of output on a larger scale? Then you can't point to any one individual either.

So it's a semantic point. I think you're right, fundamentally. But I think we can very quickly reach a point where we have to travel through a very long rabbit hole to get back to that key human influence.


> Thought experiment: one day it is not uncommon for some deeply moving, emotional works to be composed entirely by computers (though performed by humans). How will we, as a society, react to that? Will we reject it as being 'unauthentic', or use it as an opportunity for introspection?

> It would even probably not be unreasonable to think that these NNs will be given names, and different NNs will be characteristically different, so then you might see a Wikipedia page or an album from a specific NN in much the same way we'd see a human composer today. How weird would that be?

I know you're talking about Neural Networks. But I'd like to point you to Vocaloids. As far as names, personalities, and music go - it's a good match for what you're getting at. Hell - one of them is even famous for having concerts! [0]

Hatsune Miku, Rin Kagamine, Luka Megurine, Neru Akita

[0] https://www.youtube.com/watch?v=dhYaX01NOfA


I really love your critique, because it seems consonant with the work my lab does. We work with EM neuron images, and we have a DNN trained on image patches that are quite small, as the training cost is quite high. If I recall correctly, it was trained for months using 7x7x7 patches on several 100^3 voxel volumes. The output of the network was fractured, and at the time it was the best we could do. However, because humans can see the bigger picture (more than a 7x7 patch at a time on a 2D image display), we can resolve the splits in the image segmentation that the AI is simply unable to. It's worth noting that my lab uses convolutional neural networks (CNNs), not recurrent neural networks (RNNs).

It seems to me that this is a technical problem of being able to train and run a large enough network to approach human abilities in pattern recognition. Sometimes this is easier than others.

A disclaimer, however: though I hang out with machine learning people, I'm not yet one myself.


The author could change how a piece is generated: generate a short theme, fix that theme at particular points in the note sequence, generate the gaps between them, and then repeat the process as recursively as he wants.

This would provide some recurrence of melodies and would sound infinitely better. The problem with these kinds of models is the window-based structure, with no notion of a general theme. The theme should then be added by the human, and everything else can be filled in using the model.
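
To make that concrete, here is one possible reading of the idea as a sketch, not the article author's code. The human-supplied theme is pinned at both ends of the piece, and each gap is filled the same way one level down with a freshly generated motif. model_fill is a hypothetical stand-in that picks random scale notes where the trained network would go, and the pinning positions (start and end of each gap) are my own assumption.

```python
import random

SCALE = [60, 62, 64, 65, 67, 69, 71, 72]  # C major as MIDI pitches

def model_fill(length, rng):
    """Stand-in for the trained model: fills a gap with random scale notes."""
    return [rng.choice(SCALE) for _ in range(length)]

def fill(length, depth, rng, motif_len=4):
    """Fill `length` slots; at each level a new motif is pinned at both ends of the gap."""
    if depth == 0 or length < 3 * motif_len:
        return model_fill(length, rng)
    motif = model_fill(motif_len, rng)
    middle = fill(length - 2 * motif_len, depth - 1, rng, motif_len)
    return motif + middle + motif

def compose(theme, length, depth, seed=0):
    """Pin the given theme at the start and end, then fill the gap recursively."""
    if length < 2 * len(theme):
        raise ValueError("piece is too short to hold the theme twice")
    rng = random.Random(seed)
    middle = fill(length - 2 * len(theme), depth, rng)
    return theme + middle + theme

theme = [60, 64, 67, 72]  # hypothetical theme written by a human
piece = compose(theme, length=32, depth=2)
print(piece)
```

With length=32 and depth=2 the theme frames the piece, a second motif frames the 24 notes inside it, a third frames the 16 inside that, and only the innermost 8 notes are free generation.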


Yes, in short the music doesn't make sense: there is no story in it, or any message. But anyway, it's impressive that the network has generated this music.


Why can't it just be the story of a random walk through classical-music-space?


It can, of course. The same way I can just produce random noise and say it is music (avant-garde), and that those who don't like it just understand nothing about art.


Or a more illustrative analogy: suppose it was generating text instead of music, and produced a sequence of words, some unfinished sentences, or maybe even complete sentences, maybe even sharing subjects (names) sometimes between sentences. But the whole text does not deliver any story. But you say: "maybe it's a story, where all these words follow each other". Well, maybe, for someone. Maybe it is avant-garde poetry.

But for me this opus has no real meaning.


Sort of like 'Finnegans Wake' :)


Finnegans Wake is packed full of self-referential meta-data and has a highly coherent structure.

In this sense, it's relevant to the discussion because it may appear to have been created by Markov chains whereas in fact it's intelligently molded and makes (more and more interesting) sense from the many perspectives you start to have as you spend your life with it.

Excellent project here, which even from reading the first page you'll have an a-ha moment: http://www.wakeinprogress.com/2010/10/introduction-to-charac...


Because for it to be that you'd need a workably good - if not fully complete - definition of classical-music-space. And this clearly doesn't have that.


Because no message is not a message. Not even a medium.


Well, this was trained on select pieces from 25 composers. Maybe it would be better trained on a single artist's catalog or something like genre. Pandora could do interesting things with their Music Genome Project.

I would say the next step is analyzing structure, but this makes writing music stupid easy. Just wait for something interesting and "everything is a remix" it.


> That being said, I don't think this problem is inherent to the approach of creating music with NNs. Right now it sounds like what you'd get with a well-crafted Markov chain, but NNs can go beyond this, and this article is exactly the kind of thing that will instigate this evolution.

That's a very thoughtful conclusion.


Since you are a classical music student, I'd like to ask: did you think the sample piece on the page was truly classical?

I'm no expert, but to my ear it sounded more like a later Baroque period piece, or very early Classical at most.


In your opinion, is it possible to learn classical music theory without being a musician? Also, is it something that can be learned by reading and listening, or does it require a teacher to be present?


I think that the doctrine "do, and understand" applies particularly well to classical music theory and composition, though I'm very biased in that regard. My post above mostly concerns the "theory of harmony", and my recommended way of learning that is to grab a textbook on the subject (one with exercises) and do the composition exercises. No need to play any instrument; as a geek you can use any software that features a good piano roll. And here's the most opinionated part: a teacher is useful but not strictly necessary.


As someone who studied music but never got the commitment and discipline to play it well, I can tell you that learning music theory can be easier than playing it - at least for people like me.

This is strange because I can paint and sculpt very well, but playing is such a struggle for me... I really envy people that can play or sing naturally.


It sounds kind of like a mashup of Philip Glass and Mozart.





