autoedit
audio
tools
ai
music
autoedit is the mother of a suite of ‘auto’ tools for music creation. It currently includes autoedit (make automatic edits of music audio), autocover (create a cover from the track audio), and automaster (create a master from the track using the matchering technique with a reference file). autovoice (automatically beat-align voice snippets over a track) and autofold (fold audio tracks over themselves and each other) are in experimental status.
autoedit is a Python audio tool that creates automagic edits of too-long or too-short audio files. Randomly, no repeat. Generate and test. The good news is, they always sound good. How? With big use of small AI. Now available as a web app, no install required.
tl;dr autoedit
autoedit is a tool that creates an edit from your input file. Edit means: the input file is broken into chunks, then some chunks are chosen and arranged in some order to make the output file.
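In other words, the core loop is: split, choose, arrange, write. Here is a minimal sketch of that idea in Python, with a hypothetical file name and plain equal-sized chunks (the real tool slices on musical structure instead):

```python
# Minimal sketch of the autoedit idea, not the actual implementation:
# split the input into chunks, choose some, arrange them, write the output.
import random
import numpy as np
import soundfile as sf

y, sr = sf.read("input.wav")            # hypothetical input file
chunks = np.array_split(y, 50)          # split into chunks (here: equal-sized)
chosen = random.sample(chunks, 10)      # choose a subset of chunks
edit = np.concatenate(chosen)           # arrange them into one output signal
sf.write("edit.wav", edit, sr)
```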
Quick start
autoedit is available as a Python project, as part of my smp audio codebase on GitHub. If you are on good terms with setting up a Python environment and installing a set of dependencies, you are good to go with this.
If you are not inclined to set up the Python project, we have prepared autoedit to be used as a web app. To use it, you need to enter an API key. First time round, please be in touch with me to get a key. Once you have the key, go to https://api.jetpack.cl and paste your key into the form field you see on that site.
On the site you will find four red tabs: Drop, Files, Actions, Tasks. Upload some files via the Drop tab, either from your hard disk or from the internet via a YouTube or similar URL.
Go to the Files tab to make sure your files are there.
Go to the Actions tab and select autoedit. In the autoedit configuration form, add an input file and set the duration to something small, 30-60 seconds. Set the assembly mode to random. Set the number of segments (numsegs) to something between 20 and 200.
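Written out, the form boils down to a handful of parameters. The field names below are illustrative only, not the web app's actual API; the values mirror the suggestions above:

```python
# Roughly what the autoedit configuration form corresponds to.
# Field names are hypothetical, chosen to match the form labels.
autoedit_config = {
    "input_file": "my-session.wav",  # a file uploaded via the Drop tab
    "duration": 45,                  # target length in seconds, keep it small at first
    "assembly_mode": "random",       # assembly mode
    "numsegs": 100,                  # number of segments, 20-200 is a good range
}
```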
Click start. Wait a bit, then download the .wav result and ignore the other outputs. Don’t like it, rerun it, like it. Generate and test.
Why autoedit?
A lot more music is created every day than is actually published.
To some extent that is about the confidence people have in the relevance of the music they make within their own sociocultural island. They play it safe and say: I’m just making this for my own pleasure. Granted.
Even for music producers that would describe themselves as such, and who have previously published music to critical acclaim, a large proportion of what is created remains dark in private storage and is not published, for a number of reasons.
One of these reasons is the effort of going the last mile, so to say: the effort to take a raw piece of music with some good bits in it, like a session recording, and turn it into a refined and publishable form. The reluctance to do so is understandable because it can become a lot of effort, even just screening a longer session to get an overview of what is in there.
What then, if this part of the production/publication cycle could be automated? Would we hear even more cool music than we do already because just more of what is made is actually published?
Why autoedit? II
In many areas of professional and recreational activity, you find yourself in front of long media files, often recordings of some session or event. Common to all these cases is the wish to shorten these files for publishing and easier consumption. Shortening means to cut the original recording into slices, leaving some or most of them out and picking a few to remain in the resulting shortened version.
Most often, there are parts which do not contain any relevant information, while there are other parts where a lot of relevant information is packed into a small amount of time (think, for example, of biomonitoring and wildlife recordings).
In all of these cases, the transition from one part to the next is marked by the onset of something new, which hasn’t been there before. If you automagically slice the original recording into chunks on each of these onsets, then select the relevant chunks, and glue them back together to form an output, that would help a lot. Example areas where this is daily business are music production, publishing, event organization, and journalism.
Turns out that can be done.
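To make the slice-select-glue idea concrete, here is a minimal sketch assuming librosa and soundfile are available. The file name and the random selection are placeholders, and the real tool combines several detectors rather than relying on a single onset pass:

```python
# Sketch: slice a recording at detected onsets, keep a random subset, glue it back.
import random
import librosa
import numpy as np
import soundfile as sf

y, sr = librosa.load("session.wav", sr=None, mono=True)       # hypothetical recording
onsets = librosa.onset.onset_detect(y=y, sr=sr, units="samples")
bounds = np.concatenate(([0], onsets, [len(y)]))              # slice points
chunks = [y[a:b] for a, b in zip(bounds[:-1], bounds[1:]) if b > a]
keep = random.sample(chunks, min(20, len(chunks)))            # select some chunks
sf.write("shortened.wav", np.concatenate(keep), sr)           # glue them together
```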
How does it work?
TODO
What is autoedit?
Autoedit was invented collectively, while observing a session-based production workflow. Visiting friends in the studio on a regular basis, we observed that capture (recording the stuff) was being handled, yet consolidating or even publishing it was not.
A lot of thinking ticks later, we arrived at a workable proposition consisting of a collection of Python scripts that do the job. A couple of run-ins with potential users, and going through their individual installation processes, told us to create a web app and provide autoedit in the style of Software-as-a-Service (SaaS).
If you’re in a hurry, enter https://api.jetpack.cl
Questions
what happens if you tell autoedit to create a track longer than what you give it? does it just repeat segments?
yes
how do i credit droptrack on my release?
please use
“Sensorimotor Primitives Auditory Perception Software by Jetpack Cognition Lab”
Design philosophy
What we are talking about is music cognition, and what we use to create that is a set of libraries from the Music Information Retrieval (MIR) community [1]. There exists a significant amount of music cognition ability in these libraries.
We set out to create a segmentation of the input file into a set of different parts. Part in the sense that you and I would agree these constitute different parts of any given piece of audio recording. Twenty seconds of the main voice active, 40 seconds of no bass drum, 12 seconds of a single voice. That kinda stuff, mute on / off on the desk. In addition we want the large-scale segmentation to be aligned on the beat grid, if the audio has a beat (every audio has some degree of beat, but don’t tell anyone, else confusion and cost would rain down on us).
Now there are some realizations from that which we can put into action:
1/ combine the functionally overlapping feature detectors of different MIR libraries to obtain an improved result
2/ use onset-type features on two different time scales
3/ combine the two time scales to make beat-aligned segment boundary inferences (see the sketch after this list)
4/ use random strategies; they are cheap and always worth it.
(The idea can be extended to other domains; that’s another post.)
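For points 2/ and 3/, here is a minimal sketch of beat-aligned boundaries, assuming librosa. The actual tool fuses detectors from several MIR libraries, which this does not attempt, and the file name and segment count are arbitrary placeholders:

```python
# Sketch: propose coarse segment boundaries, then snap them to the beat grid.
import librosa
import numpy as np

y, sr = librosa.load("track.wav", sr=None, mono=True)   # hypothetical track

# fine time scale: onset strength, clustered into a handful of coarse segments
onset_env = librosa.onset.onset_strength(y=y, sr=sr)
bound_frames = librosa.segment.agglomerative(onset_env.reshape(1, -1), 8)
bound_times = librosa.frames_to_time(bound_frames, sr=sr)

# coarse time scale: the beat grid (assumes the audio has a beat)
_, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
beat_times = librosa.frames_to_time(beat_frames, sr=sr)

# snap each proposed boundary to its nearest beat
aligned = beat_times[np.abs(beat_times[:, None] - bound_times[None, :]).argmin(axis=0)]
print(aligned)
```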
Footnotes
[1] ISMIR Community, TODO insert link