20 min read 🤓
may 28, 2019
Dedicated to Jochem Hartz (1961-2021) — true artist, true friend.
« ik kan mij vrij
van tijd tot tijd bewegen »
[ "I can move freely from time to time" ]
ETUDE
Now available on Bandcamp! Each of the album's 52 tracks—52 weaks makes a yeer—combines an etude, i.e. a run of a 1985 BASIC program with that name, 'poking' Commodore 64's classic SID (sound interface device) chip —for these particular recordings emulated using Virtual C64 on a MacBook Pro— with a rendition —as a [MIDI] piano track— of the hearing of that etude by an artificial listener, in casu: ScoreCloud Express, the smartphone app version of an 'intelligent', state of the art music notation software called ScoreCloud, developed by Doremir, a Swedish music intelligence company.
The download of the digital album includes a 71 page e-book with an introductory text, the full BASIC program of ETUDE
, and the 52 scores that show the artificial listener's musical understanding and melodic re-cognition of the 'etudes'.
Even though I am not quite sure that indeed I do [remember rightly], I got my first personal computer some day in 1985, if I remember rightly. (added June 22 2023 - I bought it from Wijnand de Groot, friend, sound man, very early digital recording adopter & wizz, and the violin player in Presse Papier.)
It was a Commodore 64.
ETUDE
was the name that I gave to a BASIC sound program that I made for that little machine, making use of the SID (Sound Interface Device) chip that came with it, which was way ahead of its time. I wrote it one evening not long after I had bought the Commodore, second hand, from a friend. ETUDE
had little serious intent, it was sort of like a private joke.
Every run of ETUDE
has a fixed, strictly circumscribed format: it consists of 51 triads played by SID's three voices; within each of the 51 triads each voice retains the same envelope (set by a pseudo-randomly chosen triplet of values at the start of the run); all of the in total 153 frequencies are pseudo-randomly picked within a (slightly wavering) 40 note harmonic scale, that ranges, in +/- 15.5 Hz steps, from about a D#3 (∼155 Hz) to somewhere halfway between an F#5 and a G5 (∼760 Hz); the duration of each of the first 50 triads is chosen pseudo-randomly between 10 and 1089 (∼milliseconds), the 51st triad will always last 3000 (∼milliseconds) [a run in practice turns out to last a little over a minute]; finally, each of the 153 tones is assigned one of three waveforms (triangle, sawtooth, white noise), also pseudo-randomly, but weighted so as to make 'triangle' dominant.
Within this rigid framework, all other choices are left to the Commodore's pseudo-random RND(0)
routine, which of course only in appearance is lawless (random). The numbers it spits out are merely, well-hidden and deviously concealed, reflections of the C64 processor's binary lawfulness, reflections that are not easy to literally reproduce.
Every run of ETUDE
thus composes an etude, a pseudo-lawless lawful tune.
Even though I am not quite sure that indeed I do [remember rightly], I bought my Commodore 64 some day in 1985, if I remember rightly.
I tried to find confirmation for this. Words written on the yellowing pages of a notebook or a diary from those days, something other than and independent from my blood-and-flesh memory. Because it might also have been '84; or '86... But anyway, it was, it certainly was, some time around back then, in the middle of the greyish 1980s.
The popular Sinclair ZX-Spectrum and the even more popular Commodore 64 were the first computers —still mouse-free, with nothing much of a 'user interface' and basically just running one program at a time— that were cute, versatile and affordable enough to slip into generations of then teens and twens' bedrooms and studies, into the recording studios of musicians, designers and visual artists. They also became the wet dream and obsession of small armies of hackers, geeks and nerds. [For Dutch readers: there's more about this in my recent 'Hoofd Stuk' in Gonzo (Circus) #151: "Beep," zegt de muis. Het ongrijpbare lichte van 8-bit en lager ]
I got mine from a friend. It was a previously owned C64. I was poor as a church mouse back then and the things actually were not all that cheap.
Computers and compu-tings an sich, though, were far from new to me. In 1975/1976, having just arrived in Amsterdam, I learned to program in FORTRAN, on the Amsterdam university's mainframe machine that then was housed in the Roetersstraat ...I think, for again, it's just what my memory tells me. I am pretty sure though that it was there that I handed over my packs of punch cards with programs and data to run.
A bit later, in the years 1983-1985, I spent a lot of time with the DEC (Digital Equipment Corporation) PDP (Programmed Data Processor) computers at the Institute for Sonology, then an institution that was part of Utrecht University and picturesquely located on the Plompetorengracht, where, as a student, I eagerly pursued a whole range of computer related projects, like attempts at writing short pieces in Harry Partch's 43-tone scale using Werner Kaegi's VOSIM (VOice SIMulation) generators and his MIDIM (MInimum DescrIption of Music)( * ) system.
But most time I spent there with Gottfried Michael Koenig's PROJECT (serial) composition programs, PR1 and PR2, which I found utterly fascinating and which neatly fitted my then obstinate reading and studying on/of all thinkable serial approaches to musical composition.
Series and sequences, from formal(l)awfully regular via quirk-i(r)regular to utter-stocha(s)(o)tic-al and lawless-ly random, are at the heart of all that we deem 'music' and 'musical', of all that is. So from back then to right now, how could they have ever ceased to fascinate and amaze me? Theirs is mystery's deepness only equaled by time.
ETUDE
is the name that I gave to the one and only BASIC sound program that I ever made for the Commodore 64, using the sound possibilities of the SID (Sound Interface Device) chip that came with it. I wrote it one evening not long after I had bought the little machine, with little serious intent, sort of like a private joke. A binnenpretje we call that in Dutch. For starters, I had the word ETUDE
appear in the middle of the screen using the primitive graphics of the little computer, by 'poking' little coloured squares in the right places. The Commodore's screen (rendered on the small portable black and white television set that I used as a monitor) was defined as a 40 x 25 grid, with each of its 1000 cells capable of displaying one of the available characters, defined in the cell's dedicated memory location, starting top left at address 1024
and continuing to bottom right location 2023
. Each cell moreover could display in one among 16 possible colours, set by writing a number between 0 and 15 in one among a 1000 more memory locations, from top-left at location 55296
to bottom right at 56295
.
Displaying the word took 50 screen-cell-sized squares, and 2 half-squares to keep the D
from looking like an O
, and 52 numbers to set the cells' colours.
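The cell addressing described above can be sketched in a few lines of Python. This is not the original BASIC, and the function names are illustrative assumptions; it only shows how a row and column on the 40 x 25 grid map to one character address and one colour address.

```python
# Sketch of the C64 screen addressing described in the text.
# Function names are hypothetical; the original program used BASIC POKEs.
SCREEN_BASE = 1024    # character memory, top-left cell
COLOUR_BASE = 55296   # colour memory, top-left cell
COLS, ROWS = 40, 25


def screen_address(row: int, col: int) -> int:
    """Address of the character shown at (row, col), both counted from 0."""
    if not (0 <= row < ROWS and 0 <= col < COLS):
        raise ValueError("cell outside the 40 x 25 grid")
    return SCREEN_BASE + row * COLS + col


def colour_address(row: int, col: int) -> int:
    """Address holding the colour number (0-15) of the cell at (row, col)."""
    return COLOUR_BASE + (screen_address(row, col) - SCREEN_BASE)


print(screen_address(0, 0), screen_address(24, 39))   # 1024 2023
print(colour_address(0, 0), colour_address(24, 39))   # 55296 56295
```

Drawing one coloured square then amounts to writing a character code to the first address and a colour number to the second.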
This here picture is very recent, made while running the ETUDE
program on the VirtualC64 emulator for MacOS, on a MacBook Pro. Though at the time of writing the program I never saw them, as said, there were colours involved. Among the 16 available three were picked randomly at the start of a run of the program; two of them were assigned —again in a random fashion— to the squares; the third one would only sometimes pop up, as the colour of one or both of the two half-squares that added the finishing touch to the D
. On the portable black and white television-set that I used as a monitor (the brand was Aristona, I think), what I saw would have been more like (a snowy version of) this:
7 START LOOP HERE
Once the word ETUDE
has been drawn on the screen, I again invoke the Commodore's (very)-pseudo random number routine RND(0)
, many times.
First, to generate 81 integers between 80 and 199, which are stored in an array called AD
(for 'Attack-Decay[-Sustain-Release]').
Second, to generate 154 integers between 10 and 49, which are kept in an array called H
(for 'Height').
Next, an array of size 16 called W
(for 'Wave') is filled, not with pseudo randomly generated values this time, but 11 of its slots get the integer 17
, 3 slots the integer 33
and 2 of them the number 129
.
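As a rough sketch of this set-up step, here it is in Python rather than C64 BASIC. Python's `random` module stands in for `RND(0)`, the seed is arbitrary, and the order of the values inside `W` is an assumption (the text only gives the counts).

```python
import random


def fill_arrays(rng: random.Random):
    """Build the three arrays with the sizes and ranges given in the text."""
    ad = [rng.randint(80, 199) for _ in range(81)]   # envelope byte candidates
    h = [rng.randint(10, 49) for _ in range(154)]    # frequency byte candidates
    # W is not random: 11 x 17, 3 x 33, 2 x 129 (slot order assumed here).
    w = [17] * 11 + [33] * 3 + [129] * 2
    return ad, h, w


ad, h, w = fill_arrays(random.Random(1985))
print(len(ad), len(h), len(w))   # 81 154 16
```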
After an initial (pseudo random) choice of three times two values from the AD
array, the H
- and W
-arrays are then used for the —as always pseudo-aleatory— generation of a series of 51 successive triads. Each triad is built from the three voices (tone generators) of the Commodore's SID-chip, with differing (pseudo randomly picked) values from the H
-array for their frequencies, and differing waveforms, (pseudo randomly) picked from the W
-array, but always with the initially chosen ADSR
values.
Each of the three generators has its corresponding set of SID memory addresses (registers), each address being an 8-bit byte. The (decimal) addresses are at 54272 to 54278 for voice 1, 54279 to 54285 for voice 2 and 54286 to 54292 for voice 3.
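Since every voice occupies seven consecutive registers, the address ranges can be derived from the base address. The Python sketch below does that; the register labels follow the usual SID layout, and the helper name `voice_registers` is a hypothetical one chosen for illustration.

```python
# Sketch: derive the register addresses of each SID voice from the base address.
SID_BASE = 54272
REGISTER_NAMES = (
    "frequency lo",
    "frequency hi",
    "pulse width lo",
    "pulse width hi",
    "control (waveform)",
    "attack/decay",
    "sustain/release",
)


def voice_registers(voice: int) -> dict:
    """Map register names to decimal addresses for voice 1, 2 or 3."""
    if voice not in (1, 2, 3):
        raise ValueError("the SID has three voices, numbered 1 to 3")
    start = SID_BASE + (voice - 1) * len(REGISTER_NAMES)
    return {name: start + offset for offset, name in enumerate(REGISTER_NAMES)}


for v in (1, 2, 3):
    regs = voice_registers(v)
    print(v, min(regs.values()), max(regs.values()))
```

ETUDE only needs five of the seven per voice: the two frequency bytes, the control byte and the two envelope bytes.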
[ Envelope ] So, before generating and playing one after the other the 51 triads, the program selects RND(0)
-randomly 6 numbers, each between 80 and 199, from the AD
array, setting the 'Attack-Decay' and 'Sustain-Release' byte values for each of the three voices.
The following then is repeated 51 times:
[ Frequency ] From the H
array, 6 entries are RND(0)
-randomly selected, this time resulting in values between 10 and 49. Each of the voices uses 2 bytes that together form a 16-bit number that linearly controls the voice's frequency, enabling oscillator frequency values between 0 and 2^16 − 1 (65535); that's a pitch frequency range from 0 to just a little under 4 kHz.
[ Waveform ] Then for each of the three voices a waveform is selected, RND(0)
-randomly, from the W
array, containing 11 number 17's, 3 number 33's and 2 number 129's. This 'uneven' distribution intends (and manages) to favour the choice of number 17, which, written in the appropriate register, sets a voice's waveform to triangular. Number 33 is sawtooth. And 129 ... that's white noise.
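How strongly the 16 slots favour the triangle wave is easy to quantify. The sketch below assumes that each slot of the array is equally likely to be picked, which the text implies but does not state outright.

```python
from collections import Counter
from fractions import Fraction

# The 16 slots of the W array as described: 11 x 17, 3 x 33, 2 x 129.
W = [17] * 11 + [33] * 3 + [129] * 2
NAMES = {17: "triangle", 33: "sawtooth", 129: "white noise"}


def waveform_odds(slots):
    """Probability of each waveform under a uniform pick from the slots."""
    counts = Counter(slots)
    return {NAMES[value]: Fraction(n, len(slots)) for value, n in counts.items()}


for name, p in waveform_odds(W).items():
    print(f"{name}: {p} ({float(p):.1%})")
```

Under that assumption roughly two out of three tones are triangle waves, and only one in eight is noise.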
As all integers used to set the voice frequencies are between 10 and 49, the oscillator frequencies that sound in ETUDE
span a range from 10 × 2^8 + 10 = 2570, i.e. ≈ 155 Hz (about a D#3), to 49 × 2^8 + 49 = 12593, i.e. ≈ 760 Hz (about halfway between an F#5 and a G5). ( ** )
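The arithmetic can be checked with a small Python sketch. The conversion to Hz assumes the relation quoted in note (**), oscillator value times system clock divided by 2^24. The exact result depends on which clock is assumed, so the default of 1 MHz here is only one of the options mentioned there.

```python
def oscillator_value(hi: int, lo: int) -> int:
    """Combine the high and low frequency bytes into the 16-bit register value."""
    return hi * 2**8 + lo


def pitch_hz(fn: int, clock_hz: float = 1_000_000.0) -> float:
    """Approximate output frequency, assuming Fout = Fn * clock / 2**24."""
    return fn * clock_hz / 2**24


for hi, lo in ((10, 10), (49, 49)):
    fn = oscillator_value(hi, lo)
    print(hi, lo, fn, round(pitch_hz(fn), 1), round(pitch_hz(fn, 1_023_000.0), 1))
```

With either clock the lowest and highest values land within a few Hz of the 155 Hz and 760 Hz given in the text.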
The following table shows the high bytes / low bytes for the lowest (10|10) and the highest (49|49) possible oscillator frequencies for each of the three voices in ETUDE
's triads:
| hi byte | lo byte |
lowest (hi 10, lo 10) | 0 0 0 0 1 0 1 0 | 0 0 0 0 1 0 1 0 |
highest (hi 49, lo 49) | 0 0 1 1 0 0 0 1 | 0 0 1 1 0 0 0 1 |
As the 'high' byte is the principal half of the frequency value, the piece uses a 40 step arithmetic frequency scale, starting at 155 Hz and stepping up to 760 Hz (i.e. ranging over a little more than 2 octaves), with a step width (common difference) of about 15.5 Hz. So, we basically have a harmonic series with base frequency 15.5 Hz, transposed over ten times that base frequency. Each of the scale's frequencies ('notes') then possibly is very slightly 'bent' upwards by something between 0.6 and 3 Hz. If indeed the smallest pitch variation that the average human ear can detect, relative to some reference frequency, is a relative frequency difference of around 0.0035 (just a little over 6 cents), these 'bendings' would in principle be discernible to our ears. ( *** ) In practice, however, these tiny variations will have little or no influence on the perceived pitch (that of the 'high' byte). E.g., Sven Ahlbäck in his 1994 thesis Melody Beyond Notes. A study of musical cognition (the music cognition model that seems to be underlying the music notation software ScoreCloud) regards 'pitch differences below 30 cents [...] structurally insignificant'.
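To put the 'bendings' and the thresholds on one scale, the Hz figures can be converted to cents. The Python sketch below uses the rounded values from the text (155 and 760 Hz for the ends of the scale, 0.6 and 3 Hz for the bend), so its output is indicative only.

```python
import math


def cents(f_ref: float, f: float) -> float:
    """Interval from f_ref to f in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(f / f_ref)


def bend_cents(note_hz: float, bend_hz: float) -> float:
    """Size in cents of an upward bend of bend_hz applied to note_hz."""
    return cents(note_hz, note_hz + bend_hz)


print(f"relative difference 0.0035: {cents(1.0, 1.0035):.2f} cents")
for note_hz in (155.0, 760.0):
    for bend_hz in (0.6, 3.0):
        print(f"{note_hz:.0f} Hz + {bend_hz} Hz: {bend_cents(note_hz, bend_hz):.1f} cents")
```

Because the scale is arithmetic, the same bend in Hz is a larger interval at the low end of the scale than at the high end.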
[ Duration ] Having set envelope, frequency and waveform for each of the three voices, the play time of the triad in Etude is then determined by having the program idly 'loop', i.e. stepping back and forth a certain number of times, that number once more RND(0)
-randomly selected, between 10 and 1809. One may think of these durations as being in milliseconds.
Each run of ETUDE
then repeats this triad building and playing 51 times, in between the (pseudo randomly) one by one erasing of the 52 little coloured squares that form the word on the screen. The final (51st) triad is given a duration of 3000, after which the one remaining little coloured block disappears, and the screen has become blank again.
And then ETUDE
starts all over again.
It is an infinite loop:
667 GO TO START
Here is an example of the values obtained for the envelope and those for frequencies, waveform and duration of the first three of the 51 triads of some run of the program (the full list, of course, will be no less than that run's detailed score... )
| voice 1 | voice 2 | voice 3 |
envelope (all triads) | | | |
AD | 116 | 99 | 100 |
SR | 156 | 140 | 120 |
triad 1 | | | |
lo | 12 | 15 | 39 |
hi | 15 | 44 | 16 |
wave | 17 | 17 | 33 |
duration | 980 | | |
triad 2 | | | |
lo | 26 | 22 | 45 |
hi | 37 | 43 | 48 |
wave | 129 | 17 | 17 |
duration | 825 | | |
triad 3 | | | |
lo | 20 | 34 | 17 |
hi | 18 | 24 | 10 |
wave | 33 | 17 | 33 |
duration | 783 | | |
et cetera... | | | |
Each of the triads thus is determined by 10 + 6 numbers, the six being the ADSR
-values that hold for all 51 of them. That makes the sounding of each run of ETUDE
fully determined by a list of 6+(51 × 10) = 516 (DXVI
) integers.
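The count can be made concrete with a Python sketch that generates one such list. It is a simplification, not the original program: values are drawn directly from their ranges instead of via the `AD` and `H` arrays, Python's `random` replaces `RND(0)`, and the upper duration bound is a labelled assumption.

```python
import random

WAVES = [17] * 11 + [33] * 3 + [129] * 2
# Assumption: upper bound of a triad's duration, taken from the format summary.
MAX_DURATION = 1089


def generate_run(rng: random.Random):
    """Return (envelope, triads): 6 envelope bytes and 51 triads of 10 numbers."""
    envelope = [rng.randint(80, 199) for _ in range(6)]   # AD and SR per voice
    triads = []
    for index in range(51):
        triad = []
        for _voice in range(3):
            # lo byte, hi byte, waveform for one voice
            triad += [rng.randint(10, 49), rng.randint(10, 49), rng.choice(WAVES)]
        # The last triad always lasts 3000; the others get a random duration.
        triad.append(3000 if index == 50 else rng.randint(10, MAX_DURATION))
        triads.append(triad)
    return envelope, triads


def flatten(envelope, triads):
    """The run's 'score' as one flat list of integers."""
    return envelope + [number for triad in triads for number in triad]


score = flatten(*generate_run(random.Random(52)))
print(len(score))   # 516
```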
As said, ETUDE
came to me sort of as a binnenpretje that summarised, tongue-in-cheek, in a mere handful of straightforward BASIC lines executed by un petit ordinateur pour le grand public, decades of tough work and thinking in avant-garde electronic music. A joke indeed, played by the unstoppable & lightning fast digital [r]evolution on many that were researching and creating in the field (often making whole chunks of their hard work and thinking obsolete way before its fruits could be properly reaped); a summary à la hache, of course.
It was a binnenpretje with a serious undertone that lingered. In the 30 ∼ 35 years that came after, I continued to come back to ETUDE
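, quite regularly.
"Tonnen wolkendek trokken in iedere denkbare richting over de stad, steeds weer ging de zon op en onder en duizenden dezelfde malen schoven aan de straatkant de gordijnen met haar mee open en weer dicht." [ "Tons of cloud cover drifted over the city in every conceivable direction, again and again the sun rose and set, and thousands of the same times the curtains on the street side slid open and shut again along with her." ]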
It played, for example, a major part in a short movie called COMMODORE
, that we shot with some friends and two actors in May 1999 in a squat in the port of Amsterdam, but that until this very day has remained but a box of taped rushes, and never got edited into a show-able form.
ETUDE
's sound vaguely reminds me of a barrel organ that, kind of, went off the edge, like the one in ookoi's Raudio Gaga clip from 2009. (↓)
It is one way to think of the little program: as a generator of a multi-galactic sized library of barrel organ books, all in this one same format —the program being the generic barrel organ book of that given format, the meta-book so to say— and each of them a next one in an unending (with respect to all of our human measures) stream of sequences of fifty-one out-of-tune notes; melodies set in wavering pitch and played in a mostly pretty hard to quantify rhythm.
There is something obviously Library of Babel-ish about this, also one of my ongoing fascinations. However, there's an essential difference between the pseudo-random generation of letter strings, unweighted or weighted to reflect the relative frequencies of letters as they occur in some target language, as in Jonathan Basile's online version of Borges's Babel library or in the ZX-Spectrum screenshot below, and the 'lawless' tunes that are being generated by ETUDE
.
None among such (pseudo-)randomly generated sets of letter strings will strike a reader as having textual potential (I try to be very careful in using the word 'meaning', but we might indeed say it here: meaning), other than that of the couple of isolated, short words that almost always will pop up (on the ZX-Spectrum screen ↓ we read sea, fan, and...). There will be exceptions, of course. But these will occur with a probability somewhat alike the chance of hitting upon one special, marked, grain hidden at random in a thousand miles stretch of sand.
On the other hand, all of ETUDE
's outputs have musical potential. More so, I actually think of every single one of them as a music that is, or, at least, as a music that will be. They are possible musics. ( **** )
- Music is realised by a listener who listens
- Music moulds (states of) mind
- Musical form is realised by a listener who repeatedly listens.
- To understand music you (just) need to walk with her, again and again: time is musical understanding's principal object, it is fully grasped by re-hearing.
The ETUDE
album now available on Bandcamp is based on the sound recordings of 52 runs of the program in the Commodore emulator Virtual C64, all made on the same evening in April of this year.
I then used a more recent software to be ETUDE
's audience: I made an artificial, a virtual, hearer listen to all 52 etudes. What is it that such an artificial musical 'intelligence' will hear, when it is confronted with ETUDE
's pretty much lawless micro-tonal electro-tunes? For this I used ScoreCloud, mainly because I had tried the smartphone app version of this music notation software (ScoreCloud Express) before, and it was already installed on my iPhone.
So, ScoreCloud Express listened with the iPhone's ears (microphone) to each of the 52 etudes —sometimes more than once— played back to it via the built-in loudspeakers of my MacBook. And each time it listened, it reported on its musical experience and understanding —its melodic re-cognition— in the form of a score. In which ETUDE
's wavering pitches are being forced onto the twelve notes of our equally tempered scale, and the software somehow needed to come to terms with the even more 'wavering' durations of ETUDE
's 51 triads.
I find the results utterly fascinating.
It is why on the album I have each of the etudes followed by a MIDI piano rendition of what the smart music notation software re-cognized.
This artificial listener, ScoreCloud Express, at the moment of this writing to me is nothing more than a black box. I input a sound recording, and out comes a score. I did try to find out more about what makes it tick, about what it is that actually happens inside the box. It seems that what goes on in there is based upon a melody cognition model developed by Sven Ahlbäck, one of the founders of Doremir Music Research AB, the Swedish 'music intelligence' company behind ScoreCloud, in his 1995 doctoral thesis, Melody Beyond Notes. I did spend some time looking through the thesis, but I guess that, given the almost twenty five years that have passed since Sven's graduation, this is not the best and hardly an efficient way to learn how ScoreCloud 'understands' —re-cognizes— a sound input.
ETUDE
actually appears to give the sort of input that, at different times, with different tries of the very same recording, is heard differently by the artificial listener. A repeat of the playback to ScoreCloud of the same etude, results in quite different re-cognitions, it produces quite different scores.
But is it not so that I, too, from time to time will hear the same track in a 'different' way?
"The cognitive process continues also after the sound of the melody has stopped" ( ***** )
The very lo-fi clip below is the registration of a run of ETUDE
, made some twenty years ago on a PowerMac, using another Commodore 64 emulator, called Power64. It is called Etude 004, (probably) because it was the fourth in a series of ETUDE
audio/video recordings.
The clip is followed by three of the scores that resulted from ScoreCloud's listening to the clip's sound track.
How many different ones, would the software, eventually, come up with? - And what, here, is 'different'? Should not something be common to all of them? Can a detailed analysis of the three little scores below bring to the fore that they indeed have a common source? All questions that for now I can only try to guess (some of) the answers to... Shall I ask Doremir...? 🤔
As a bonus, for those of you who in some way or other arrived at the very bottom of this long post, here's a, say, 'free' mix, in which a time-stretch of Etude 004 combines with a number of its different ScoreCloud interpretations 🤗 ...
[ added June 13th, 2019 ]
I did ask (info@)Doremir, for information and whatever, but I did not get a reply. Actually, it's quite possible that in the algorithm(s) used something is also at work like in Onsets and Frames, a pre-trained neural network to convert raw audio to MIDI. There is actually an online (JavaScript) version available, PianoScribe, that works in your browser. Better use Firefox, not Safari, if you're on a Mac. And try short pieces (seconds). Longer ones (I tried a full one minute and something etude) froze my MacBook Pro. For a fifteen seconds fragment, PianoScribe did a pretty good job. It came up with a fine rendition of the timing and of the etude's triads. It was deterministic, and gave me the same result when I ran it again with the same audio. But then, the result in its relative and, above all, repeatable accuracy makes for a music that is far, far less interesting than those resulting from ScoreCloud's 'hearings'.
notes __ ::
(*) Because of its acronym, Werner Kaegi's MIDIM is easily confounded with MIDI (Musical Instrument Digital Interface), but it has absolutely nothing to do with this technical standard and technology that was introduced in the early 1980s, coinciding with the advent of personal home computers, affordable samplers and digital synthesizers, which boosted the commercialisation of digital electronic music hardware and software. The ensuing efforts and innovations powered by the financial powers of a tech industry eager to profit from a new and quickly expanding market, in the course of the 1980s made much of the painstaking and sometimes, admittedly, rather cumbersome custom digital music synthesis systems that were being developed for research and artistic production in the early academic institutions for electronic and computer music, like the Utrecht Institute of Sonology (the even more prestigious Parisian IRCAM is another example), obsolete.
(**) The tables in the NMOS SID circuitry's technical information sheets however use a factor 0.0596 ('the standard 1.0 MHz system clock divided by 2^24'), while the Commodore C64 programmer's guide relates the SID 6581 chip's oscillator frequency Fn to pitch frequency Fout via a factor 0.06097 (Fout = Fn × 0.06097); this corresponds to a 1.023 MHz system clock, which in fact appears to have been a C64 NTSC machine's system clock frequency as opposed to the 0.985 MHz of the C64 PAL system clock. So, well... the 1 MHz is right there in between 😊
(***) cf. http://www.cochlea.eu/en/sound/psychoacoustics/pitch, though it would be nice to have a more reliable reference.
(****) As will (arguably for some, though for me undoubtedly) be any timeline, any chain of sounds, when I make it the focus of a listening that no longer is primarily referential and semantic, i.e. a listening that is not an outward hearing, aimed at interpreting/decoding sounds as signs of objects and events in the external world. A musical hearing of strings of sound implies syntactical listening. It is necessarily an inward hearing, and —as much as possible— non-referential (de-tached). A sonic event, the now sounding, is interpreted as primarily referring to other sonic events that occur(red) in its immediate temporal vicinity (the perceptual present; think of a window/frame of dynamically varying but always small size that is moving along with the now over the timeline.) [ Outward versus inward —concatenationistic— referring / semantix versus syntax ; all hearing has both, it is two-dimensional, with an outward component, and an inward component; they will differ in weight, but neither the in-weight part nor the out-weight part can ever become really zero; seems possible though to bring them (arbitrarily?) near to nil. ]
(*****) Slightly adapted quote from Sven Ahlbäck's Melody Beyond Notes, page 40
tags: commodore, SID, chiptunes, Scorecloud, strings, random
# .488.