ORIGINAL COMMAND LINE FORMAT
INFORMATION PAGE
CONTROLLING PARAMETERS WITH FUNCTIONS
BASIC ROUTINES (ECMC script exists / no ECMC script exists)
ANALYSIS ONLY
PVANALYSIS (PVCANAL)
FREQRESPONSE
AMPLITUDE WARPING
NOISEFILTER
COMPANDER
SPECTWARPER
BANDAMP
ADDITIVE SYNTHESIS
HARMONIZER
INHARMONATOR
CHORDMAPPER
SUBTRACTIVE SYNTHESIS
CONVOLVER
FILTER
CHORDRESPONSEMAKER
FILTRESPONSEMAKER
TVFILTER
RESONANCE/REVERB
RING
RINGFILTER
RINGTVFILTER
NONLINEAR FREQUENCY DEVIATION
FILTDEVIATOR
TVFILTDEVIATOR
FEATURE EXTRACTION
ENVELOPE
CENTROID
FLUXOID
PITCHTRACKER
OVERLAP/ADD METHOD VS. OSCILLATOR BANK METHOD AND RESYNTHESIS THRESHOLDS
SOURCE
MULTIPLE CHANNELS
FLOATING-POINT AMPLITUDE RESCALING
OUTPUT STATISTICS
FREQUENCY RESPONSE TERMINAL OUTPUT
ANALYSIS FILES
DECIBELS
LOW/HI SHELF EQUALIZATION
WARP INDEX
PITCH TRANSPOSITION
FREQUENCY SHIFT
ENVELOPE RESPONSE TIME
RING DECAY TIME
FFT SIZE
WINDOW SIZE
WINDOW TYPE
FRAMES PER SECOND
TIME EXPANSION/CONTRACTION
BEGIN/END TIMES
GAIN
FILTERING: SOURCE SIGNAL LEVEL
TRANSPOSITION/SHIFT APPLICATION FLAG
FILTER TYPES: PASS OR REJECT
RESPONSE FUNCTION SMOOTHING
ANALYSIS DATA ACCESS MODE
CONVOLVER PANPOT
FREQUENCY RESPONSE ACCUMULATION METHOD
RING ROUTINES: FILTER PLACEMENT
COMPRESSION AND EXPANSION
UTILITIES
FILE CONVERSION: aiffs, aiffd, nexts, nextd, nextfloats
FUNCTION VIEWING: showme, showspect
PVC is a collection of phase vocoder signal processing routines and accompanying shell scripts for use in the transformation and manipulation of sounds. It is written in C and designed to be used in a UNIX environment. It has come about as a result of my path of education and research into phase vocoder technology. It follows in the spirit of the work by Eric Lyon (out of which PVC is built) and Chris Penrose, whose particular DSP research springs from the coding and tutorial work of F.R. Moore and Mark Dolson. Moore's book, Elements of Computer Music, published by Prentice Hall, is therefore a great resource for making sense of the phase vocoder engine, which I am unable to go into here. Curtis Roads's book, The Computer Music Tutorial, published by MIT Press, has sections on the phase vocoder as well; these may better introduce the beginner to the practical concerns of this technology. Short of the explanations these sources provide, I have attempted to offer below some explanations, particularly as needed for control of the parameters in these routines. A manual and tutorial would be great to have; unfortunately time has not yet made it so.
These routines reflect my need for tools which can perform different spectral resynthesis tasks, both simple and experimental. Their refinement has advanced with my growing skills and curiosity, which I expect will continue as long as I have questions about sound. Most of these routines can be viewed in terms of traditional additive or subtractive synthesis tasks, coming about as they did from the desire for greater finesse and control of these two basic types of synthesis. While the speculative nature of some gives them an idiosyncratic character, most should, with practice, reveal the transparency of their names if not the role they can play in the shaping of sound. All require a good ear tuned towards sound and idea, as none of these routines are automatic, although many hold great potential for the diligent.
This 3.0 release contains only those routines which I think are stable, useful and moderately transparent. Some earlier versions have been omitted, replaced or consolidated into newer routines. For example, compander remains, but the ideas behind bandamp have ripened into spectwarper, a remarkable "super companding" tool for windowing amplitude and balancing the resonance/noise-residues of a sound. The harmonic tone reorganizer, chordmapper, has continued to grow in its controls (however arcane), offering increasingly subtle ways to reorganize harmonic spectra. The noisefilter routine is now very good, having become a PVC first encounter routine for many whose noisy lives cross my path. Tvfiltdeviator now joins the arcane but novel filtdeviator routine. In addition, I have added a set of feature analysis routines (pitchtracker, centroid, envelope, fluxoid), which should be useful in generating function files to control different synthesis strategies. There are other, more experimental routines (some actually appeared in 2.0) which are still proving themselves; in time they will appear or reappear. As with 2.0, floating-point files (combined with a rescale feature) continue to be readable and writable. Someday I will deal with AIFF headers (although they do not offer floating-point values), but not for now.
Paul Koonce
koonce@music.princeton.edu
This version of the PVC html documentation has been edited from Paul Koonce's
original PVC 3.0 manual by Allan Schindler to
reflect usage of the PVC programs at the Eastman Computer Music Center.
Some portions of Koonce's original documentation have been omitted,
other portions shortened, edited or reworded, and I have placed some passages, like the paragraph above,
in a smaller font. These small font passages can be skipped by ECMC users who are just
beginning to use the PVC programs, but may be of interest to more advanced users.
Paul Koonce's complete original version 3.0 HTML documentation on the PVC programs
is available at
http://www.esm.rochester.edu/onlinedocs/PVC.3.0.README.html
and at
http://www.music.princeton.edu:80/winham/PSK/PVC.3.0.README.html
Within the ECMC version you are reading, information specific to using these
programs at Eastman,
like this passage, is printed in green font in the online version.
To make these ECMC annotations easier to spot in the printed grayscale version of this
document, the section
headers of these ECMC-specific annotations are enclosed in asterisks, as in
*** ECMC Note: ***
At the ECMC, we are running Koonce's PVCX on wozzeck, the Mac Pro in the MIDI studio, and a port of Koonce's PVC 3.0 programs written by John Gibson on our Linux systems.
The PVC package includes about 20 separate programs, or "routines," briefly described in the BASIC ROUTINES section.
However, while an initial glance at the PVC scripts may seem daunting,
things often are not so bad after all. All of the parameter options are provided
with default values. In general, if we accept all of the defaults, the result
will be straight resynthesis, and if all goes well the output soundfile should sound
identical to the original.
In your initial attempts at analysis/resynthesis, you
can skip over most of the parameter options, relying upon the default
values, and change only those parameters necessary to achieve the desired
musical result. Often, the result will be fine. But if not, you then can
adjust other parameters in attempting to achieve a more satisfactory
result or to eliminate artifacts.
(*** End of this ECMC note ***)
All of the executable PVC programs, like plainpv and twarp, originally were designed to be run from a shell window with the standard Unix command line syntax:
Information about any routine can be seen by typing the name of the routine without any arguments or file name. Typing:
plainpv
produces the following information about plainpv.
plainpv: generic phase vocoder with dynamic controls
plainpv [flags] [input file (16-bit shorts)] [output file (optional)]
(values in brackets denote defaults)
N: FFT length (must be a power of 2) [1024]
M: window size in samples (must be a power of 2) [2*FFT]
(0 will automatically set window to 2*FFT size or larger)
w: window type: 0 = hamming, 1 = rectangular
2 = Blackman, 3 = Bartlett triangular [0.]
4-12 = Kaiser windows for alpha = 4-12, respectively
(representative sidelobe levels for alpha:
4 = -30dB, 8 = -58 dB, 12 = -90 dB)
D: analysis frames per second [200]
I: time expansion/contraction factor [1.]
(duration = duration * factor, 1. = original time)
P: pitch transposition in semitones (func) [0]
a: frequency shift factor
(bin frequency adder, before -P )(func) [0.]
b: begin time in seconds [0.]
e: end time in seconds ( 0. = end of file) [0.]
C: resynthesis channel (1 -> ?) (0 = all) [0]
SHELF EQ:(post transpose/shift)
H: SHELF EQ: Low shelf gain in dB (func) [0.]
X: SHELF EQ: High shelf gain in dB (func) [0.]
m: SHELF EQ: Low shelf frequency in Hz (func) [200.]
R: SHELF EQ: High shelf frequency in Hz (func) [2000.]
W: warp index for reshaping magnitude response (func) [0.]
Values > 0 expand the dynamic range,
values < 0 compress the dynamic range.
A: gain in decibels (func) [0.]
l: envelope attack time (func) [0.]
L: envelope release time (func) [0.]
T: BRICKWALL FILTER TYPE: 0 = bandpass, not 0 = band reject [0]
f: frequency window: low boundary
(before -P and -a) (in Hz) [0.]
F: frequency window: high boundary
(before -P and -a)(in Hz) [Nyquist frequency]
p: amplitude reports print mode: 0 = off, 1 = on [0]
i: time interval between amplitude reports [.25]
_: OUTPUT FORMAT: 0 = taken from input file
1 = 16-bit integer, 2 = 32-bit floats [0]
=: PEAK RESCALE LEVEL (float output only) 0 to -96 dB
Set to 1 to rescale to level of input file. [ 1 ]
TERMINAL DISPLAY AND GRAPH FILE OUTPUT
n: number of frames [0]
u: low bin frequency [-1]
U: high bin frequency
(-1 = nyquist) [Nyquist frequency]
S: TERMINAL DISPLAY: display option [0]
(0 = off, 1 = phase data, 2 = amp data, 3 = both)
c: GRAPH FILE: WRITE ascii to FILE
0 = off, 1 = freq, 2 = decibels [0]
3 = decibels - waterfall plot
(When on, this flag writes ascii point pairs
(with time frame on x axis) for plotting
with gnuplot.)
d: TERMINAL DISPLAY FILE NAME for -c [./ascii.out]
t: oscillator resynthesis threshold in decibels [ -96 ]
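Several of the flags on this info page are given in musical or logarithmic units. The short Python sketch below is not taken from the PVC source; it only illustrates the standard conversions one would expect to lie behind -P (semitones to a frequency ratio), -A (decibels to a linear amplitude factor), -N (FFT length to bin spacing) and -I (duration scaling). The function names and the 44100 Hz sampling rate are assumptions made for illustration.

```python
def semitones_to_ratio(semitones):
    """Equal-tempered transposition: +12 semitones doubles every frequency."""
    return 2.0 ** (semitones / 12.0)


def db_to_amplitude(db):
    """Convert a gain in decibels to a linear amplitude multiplier."""
    return 10.0 ** (db / 20.0)


def is_power_of_two(n):
    """The info page requires the FFT length (-N) to be a power of 2."""
    return n > 0 and (n & (n - 1)) == 0


def bin_spacing_hz(sample_rate, fft_length):
    """Frequency distance between adjacent FFT bins."""
    return sample_rate / fft_length


def scaled_duration(duration, factor):
    """-I: duration = duration * factor, so 1.0 keeps the original time."""
    return duration * factor


if __name__ == "__main__":
    sample_rate = 44100.0  # assumed sampling rate, not stated in the manual
    print(semitones_to_ratio(12))              # 2.0 (one octave up)
    print(round(db_to_amplitude(-6.0), 3))     # roughly half amplitude
    print(is_power_of_two(1024))               # True for the default -N
    print(bin_spacing_hz(sample_rate, 1024))   # bin spacing in Hz
    print(scaled_duration(2.0, 1.5))           # a 2 second sound stretched by 1.5
```

Larger FFT lengths give a smaller bin spacing (finer frequency resolution), which is one reason some routines are said to work better with larger FFT sizes.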
Parameters that have the word (func) on the info page just before the default, as in:
W: warp index for reshaping magnitude response (func) [0.]
can be controlled dynamically by a function instead of a constant value.
*** ECMC note: Information on using the function generating routines
is provided within the
GEN FUNCTION CONTROL OF PARAMETERS
section of this document.
(*** End of this ECMC note ***)
As you can readily see from the INFORMATION PAGE output of plainpv above,
because these programs
have so many options, it is impractical to run them from a command line.
Instead, Koonce provides Bourne shell script interface files, and Gibson Perl script
files, that can be used to run the actual binary programs.
Not surprisingly, neither Koonce's nor Gibson's scripts are ideally suited to
the way we have set up the ECMC SGI and Linux systems.
Therefore, to simplify usage of many (but by no means all) of the PVC
programs by ECMC users, I have created
local ECMC scripts, with the program name followed by the extension .tp ("template"),
based upon the shell script models provided by Koonce,
that can be used to run these
routines. To obtain a list of currently available ECMC templates
for PVC programs, type
plainpv.tp syntax: plainpv.tp insound [outsound] [> scriptfile]
where "insound" is the name (and, if necessary, path) of the input
soundfile and the optional "outsound" argument is the name of the
output resynthesis soundfile. If the "outsound" argument is omitted
the output soundfile will be named "test."
After capturing this template in an ascii file and editing this
file, run plainpv with this script file with the command:
sh scriptfile
To see a "plainpv" template file without providing soundfile arguments type
plainpv.tp -
As this usage summary explains, if we simply wish to see what a plainpv.tp script file looks like, without providing any soundfile arguments, we can type
plainpv.tp -
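To obtain a script file to run plainpv using the soundfile /sflib/wind/fl.c4 as our input sound and to write the resynthesized output to a soundfile called pvcflutetest1 in our current working soundfile directory, we would type
plainpv.tp /sflib/wind/fl.c4 pvcflutetest1 > scriptfile
where scriptfile is the name we want to give to the script file.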
To obtain a script to run pvanalysis use either the command pvanalysis.tp or else the alias pvcanal.tp. To obtain a script to run twarp, use the command twarp.tp, and so on.
In addition to these .tp script templates, I have created
example script files for many (but, again, not for all) of the PVC
programs. To obtain a listing of these example files, type
pvcex filename(s) or else getpvcex filename(s)
To display one or more of these example PVC script files through the paging program "less," type:
pvcex filename(s) | less or else getpvcex filename(s) | less
To capture one or more of these files, type:
pvcex filename(s) > outfile or else getpvcex filename(s) > outfile
where outfile is the name you want to give to this file.
Soundfiles in the sflib/x directory exist for all of these examples except for a few that do not create soundfiles, but rather analysis files or some other type of file.
To learn how to use plainpv, the most basic program
in the
An ECMC help file called
pvc
summarizes the usage information above and can be consulted for
quick reference when using these programs.
*** ECMC Notes: *** When using the ECMC .tp scripts on the SGI systems, users are
relieved of the chore of converting input soundfiles from AIFF to NeXT
format, then converting the NeXT format output soundfiles to AIFF format.
The ECMC scripts handle these conversions automatically. Input soundfiles
should be in AIFF format. For each PVC job you run, the ECMC script
makes a temporary NeXT format copy of the input soundfile called pvcin,
which is used by the PVC program. The PVC program writes its output to a temporary
NeXT format soundfile called pvcout. When the job is completed, the
script creates an AIFF format copy of pvcout, and then deletes this
temporary NeXT format soundfile. Note that because ALL SGI system PVC jobs
create an output soundfile called pvcout, you can only have one PVC
job at a time running on the SGIs. Since these programs are processor and
disk intensive, it actually would make little sense to try to run two or
three PVC jobs simultaneously anyway.
Phase vocoder jobs sometimes can take a long time to run.
The PVC programs do update the output soundfile headers frequently,
so that partially completed output soundfiles can be played before the job has completed.
This must be done carefully, however. First you must
suspend the PVC job
(so that it does not continue to append samples to the soundfile while you
are trying to play the soundfile) by typing
^z
(control z). Then to play the partially completed output soundfile:
(*** End of this ECMC note ***)
All internal processing in both the SGI and Linux versions of the PVC programs
is done in floats.
However, there is an important difference between the SGI and Linux distributions
of PVC:
Note that the default is AIFF output. However, by changing
outputformat=AIFF # for Linux only : specify AIFF or WAVE output format
to
outputformat=WAVE
you can obtain a WAVE format output soundfile, even
if the format of the input soundfile is AIFF.
(*** End of this ECMC note ***)
*** ECMC Notes: ***
After playing the partially completed soundfile one or more times,
resume compilation by typing
% or fg. To kill the job, type ^c (control-c) after resuming compilation.
IMPORTANT: Even if the output is unbearably ugly,
do not forget to resume compilation and then kill the job.
If you do not resume compilation, the job will remain loaded in RAM.
(*** End of this ECMC note ***)
*** ECMC Notes: ***
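Below is a listing of the routines contained in this release along with
a description of what each does. These programs are divided here into
two groups: (1) programs for which an ECMC .tp script exists, and
(2) programs for which no ECMC .tp script exists.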
(*** End of this ECMC note ***)
*** (1) PVC Programs for which an ECMC .tp script exists: ***
Plainpv is a basic phase vocoder with control of pitch transposition, frequency
shift, time scale, amplitude warp and low/high shelf equalization. It also
has some nice controls for looking at the data produced by the phase vocoder.
*** ECMC Notes: *** At Eastman, obtain a shell script template for
this routine with plainpv.tp; edit this template, and then run the script
with the command: sh scriptfilename
All example files listed here and below are available
in the hardcopy ECMC PVC EXAMPLE FILES binder in the studios.
Twarp is like plainpv except that it works from an analysis file rather than a soundfile.
This allows you to move forwards/backwards through time according to a time function
file.
*** ECMC Notes: ***
To use twarp:
Example twarp6 illustrates
time point dithering, and is a mix of ECMC examples
twarp6-1, twarp6-2, twarp6-3, twarp6-4 and twarp6-5.
Pvanalysis is the time varying form of freqresponse that creates a phase vocoder analysis for use by other routines. The routines which
require pvanalysis files are twarp, convolver, tvfilter, ringtvfilter, and tvfiltdeviator.
*** ECMC Note: *** At Eastman, use
pvanalysis.tp (or else the easier-to-type
alias pvcanal.tp) to obtain a template file to run this routine. Edit
this file, then type sh filename to create the analysis file.
See ECMC example files
pvanalysis.voicetest and
pvcanal2.
Freqresponse is a routine used by several others to prepare a spectrum for use with
routines that filter, compress or limit. The response can be normalized or not depending
on the needs of the routine which will use the response.
*** ECMC Note: ***At Eastman,
use freqresponse.tp to obtain a template
for freqresponse. After editing this template file, run the analysis
by typing: sh filename
Noisefilter filters out the noise in a sound by subtracting out a frequency response.
The frequency response is analyzed from a short segment in the file where noise alone is found. For sounds that do not have segments of isolated
noise, there is a threshold mode.
*** ECMC Note: ***At Eastman,
run this routine with noisefilter.tp,
but, good luck. After an hour or so of testing and fussing with this program,
I was unable to come up with any musical results worth listening to
or turning into an example file.
Compander is a classic compressor/expander. What is different here is the use of
a peaks response file. The peaks response file is a frequency response,
analyzed from a segment of the sound, that is taken to represent the peak
bin amplitudes for the sound. Each frequency bin of the peaks frequency
response functions as the 0 dB reference point for that frequency bin. The
amplitude of the frequency bin is companded relative to this reference.
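The idea of a per-bin 0 dB reference can be sketched in a few lines of Python. This is only an illustration of the principle described above, not compander's actual algorithm: the threshold and ratio parameters, the function names, and the simple static curve are all assumptions.

```python
import math


def bin_level_db(amplitude, peak_amplitude):
    """Level of one bin relative to its own peak (the peak itself is 0 dB)."""
    return 20.0 * math.log10(amplitude / peak_amplitude)


def compand_db(level_db, threshold_db, ratio):
    """Hypothetical static curve: levels above the threshold are divided by
    ratio (ratio > 1 compresses, ratio < 1 expands); lower levels pass."""
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio


def compand_bin(amplitude, peak_amplitude, threshold_db, ratio):
    """Compand one bin amplitude against that bin's peak reference."""
    if amplitude <= 0.0 or peak_amplitude <= 0.0:
        return 0.0  # silent bin, or no usable reference for this bin
    level = bin_level_db(amplitude, peak_amplitude)
    out_db = compand_db(level, threshold_db, ratio)
    return peak_amplitude * 10.0 ** (out_db / 20.0)


if __name__ == "__main__":
    # Two bins with the same amplitude but different peak references:
    # the first sits at its own 0 dB, the second far below its peak.
    print(compand_bin(0.5, 0.5, threshold_db=-20.0, ratio=2.0))
    print(compand_bin(0.5, 50.0, threshold_db=-20.0, ratio=2.0))
```

Because each bin is measured against its own peak, two bins with identical amplitudes can be treated differently: only the bin that is loud relative to its own reference is reduced.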
*** ECMC Note: ***At the ECMC,
the entire analysis/companding process can be run with a script
file provided by compander.tp.
However, currently there are no ECMC example files for this program,
which is complicated to use.
Spectwarper uses an expanded compansion scheme to highlight either a sound's stronger,
resonant components or its weaker noise/residual components. Spectwarper is fairly similar to compander; however, unlike compander, which compands bins against the constant peak of an input response file, spectwarper compands bins using a peak drawn (in the current frame) from a narrow frequency
band centered around the value being processed. This causes the compansion or "warping" of the
amplitudes to accentuate (expansion) or mask (compression) formants located within the frequency bands, the result being the noise/pitch highlighting mentioned earlier. Part of this
comes from the treatment of compression in spectwarper. Unlike compander, which only reduces the amplitude above the threshold when compressing, spectwarper reduces the amplitude of the entire range, becoming, in effect, an expander
of the strongest amplitudes that expands them (when the compression level
is severe) out of the picture. Spectwarper is one of my favorite routines of late simply because it provides such
a simple and powerful control over the noise and pitch characteristics of
a sound.
*** ECMC Note: *** ECMC users can obtain a script file to
run Spectwarper with spectwarper.tp
Bandamp
is an older PVC program, no longer included in the current PVC distribution.
(Its capabilities also can be realized with the newer spectwarper program.) However,
I still find bandamp useful, and it is still available
on the ECMC SGI machines. THERE IS NO LINUX VERSION.
This program
is an amplitude windowing routine. Like compander, it uses a response
file, previously created with
To use bandamp on the ECMC SGI systems:
ADDITIVE SYNTHESIS -- HARMONIZER, CHORDMAPPER, AND INHARMONATOR:
These routines all allow for a kind of additive synthesis based on the remapping
of phase vocoder data according to some model. Each requires an ascii data
file specifying how phase vocoder information will be replicated or mapped.
This mapping is constant for the run of the routine.
Harmonizer works much like a commercial harmonizer in that it allows you to create
harmony against the source by adding a transposed copy of it. Here the concept
is extended by allowing for multiple harmonizations, each taken from a different
band of frequencies, output with separate gain.
*** ECMC Note: *** At Eastman, run
this program with a script initially obtained with the
command harmonizer.tp
Chordmapper lets you specify how harmonically related groups of partials will be replicated
or mapped to produce chords. An input data file organizes the remapping
into tone groups, and includes ways to tune or neutralize the frequency
deviations of partials. Time-varying control of these features is available
as well. You can use this routine to build up thick chords from single tones,
or to delicately reorganize a harmonic spectrum.
*** ECMC Note: *** At Eastman,
run this routine with a script file obtained with the chordmapper.tp command.
See the chordmapper example files. An ECMC help file on
chordmapper
also is available.
Inharmonator lets you specify how the partials of one fundamental will be remapped or
deviated. While the more recent and developed routine chordmapper is probably better for this task, I have decided to leave this routine
in for now. (Think chordmapper.)
*** ECMC Note: ***Well, okay, Paul Koonce doesn't seem to have much affection
for Inharmonator, but I have found this program useful.
NOTE: AS OF THIS WRITING INHARMONATOR IS BROKEN
ON THE ECMC LINUX SYSTEMS AND IS ONLY AVAILABLE ON THE SGI SYSTEMS.
This routine can be used to alter the ratios between the partial frequencies
("detuning" the sound),
and also the amplitude relationships of these partials. Because this
can be a powerful program, but complex to use, I have provided an
ECMC help file called
inharmonator
on its usage.
At Eastman, run this program with a script file obtained with
the command inharmonator.tp.
See the inharmonator example files (inharm1 and inharm2).
In its setup and control, convolver is very similar to
tvfilter. Its processing, however, is different. In tvfilter, filtering is produced by multiplying the magnitudes from the polar form of the two analyses, leaving the phases (or frequencies) of the source intact while modifying the amplitudes
of those frequencies. Convolver goes a bit further by multiplying the two analyses in their Cartesian forms. This produces an intersection of the two spectra. Unlike tvfilter, which produces a shadowlike intersection, shadowing the analysis file
characteristic onto the input sound file, convolver creates a true spectral intersection, allowing only that which is common
to both sounds to be heard. The effect is a sound which is somewhat garbled as it outputs the more
intermittently common spectral components of the two. The form of the multiplication
in convolver does not allow some of the filter transposition controls associated with tvfilter. There is, however, a convolution panpot which offers control of the mix between the convolution and source sounds.
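The difference between the two kinds of multiplication can be seen on a single frequency bin. The Python sketch below is an illustration only and is not drawn from the PVC source; the function names are hypothetical, and real phase vocoder frames would apply the same operation to every bin of every frame.

```python
import cmath


def tvfilter_bin(source, analysis):
    """Polar-form multiply: the source magnitude is scaled by the analysis
    magnitude while the source phase is left untouched."""
    magnitude = abs(source) * abs(analysis)
    return cmath.rect(magnitude, cmath.phase(source))


def convolver_bin(source, analysis):
    """Cartesian (complex) multiply: magnitudes multiply and phases add,
    so the analysis sound contributes its phase as well."""
    return source * analysis


if __name__ == "__main__":
    source = cmath.rect(2.0, 0.3)     # magnitude 2.0, phase 0.3 radians
    analysis = cmath.rect(0.5, 1.0)   # magnitude 0.5, phase 1.0 radians

    shadowed = tvfilter_bin(source, analysis)
    convolved = convolver_bin(source, analysis)

    # Both results have the same magnitude; only the phases differ.
    print(abs(shadowed), cmath.phase(shadowed))
    print(abs(convolved), cmath.phase(convolved))
```

In both cases a bin that is empty in either sound comes out empty. The complex product additionally mixes in the phase of the analysis sound, and multiplying two spectra bin by bin in this way corresponds to convolving the two signals in time.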
*** ECMC Note: *** At Eastman,
use convolver.tp to create a script file to run this routine, and see the
example files convolver1, convolver2, convolver3 (a mix of two sources:
convolver3-1 and convolver3-2), convolver4 and convolver5.
Using the pvcex or getpvcex command,
see the following example files :
plainpv1, plainpv2, plainpv3, (a mix of examples plainpv3-1, and plainpv3-2),
plainpv4, plainpv5, and plainpv6 (a mix of examples plainpv6-1, and plainpv6-2)
Example plainpv7
imposes the amplitude envelope of a maraca roll on a gong tone. The maraca roll
envelope was created by ECMC PVC example envelope1
Example plainpv8
incorporates a pitch analysis file created by ECMC PVC example
pitchtracker1
The following ECMC example files illustrate various aspects and possibilities
of this program:
twarp1, twarp2 (a mix of example files twarp2-1 and twarp2-2),
twarp3, twarp4 (a mix of example files twarp4-1 and twarp4-2) and twarp5
See ECMC example files
freqresponse1 and
freqresponse2
Currently, there are no ECMC example files for spectwarper, and this
puppy, though powerful, is not so easy to use.
See ECMC PVC example file bandamp1 (a mix of example files
bandamp1-1 and bandamp1-2) and example file bandamp2.
See also example file
harmonizer1
(a mix of examples
harmonizer1-1 and
harmonizer1-2)
and example file
harmonizer2
(a mix of examples
harmonizer2-1 and
harmonizer2-2)
Chordmapper example files: chordmapper1 (a mix of four source
soundfiles created by examples chordmapper1-1, chordmapper1-2,
chordmapper1-3 and chordmapper1-4), example chordmapper2, and
chordmapper3 and its four source files: chordmapper3-1, chordmapper3-2,
chordmapper3-3 and chordmapper3-4.
chordmapper3 and its four sources are very similar to inharmonator1
and its four sources, using
slightly different procedures to obtain almost identical results.
However, many (though not all) of the sound modification procedures
provided by inharmonator also can be obtained by using chordmapper.
Inharmonator example files: inharm1 (a mix of example files
inharm1-1, inharm1-2, inharm1-3 and inharm1-4) and
inharm2 (a mix of example files inharm2-1, inharm2-2, inharm2-3,
inharm2-4 and inharm2-5).
*** (2) PVC Programs for which NO ECMC .tp script exists: ***
Almost all ECMC users can skip the small font descriptions of the following programs,
and jump ahead to the
FEATURE EXTRACTION
section.
Because no ECMC .tp utility exists to create script files for the
PVC programs that follow,
usage of these programs is more difficult for ECMC users: you will have
to create your own script files, based upon examples provided by Koonce and
located in the directory
Filter is a very useful routine for filtering a sound by a frequency response. Filtering is achieved by first creating the frequency response through either synthesis or analysis, followed by filtering with filter. Synthetic responses are created using either chordresponsemaker (which synthesizes a spectrum as a collection of harmonic tones), or filtresponsemaker (which synthesizes a frequency response using lines and breakpoints). Analyzed responses can be made with freqresponse (which analyzes a sound file segment and constructs a response representing the peak or average amplitudes). Once made, the magnitudes of the FFT response are multiplied against the time-varying magnitudes of the input sound's FFT. Filter allows time-varying control of the response shape (warp), transposition/shift, compansion, smoothing, and source/filter mix, making this a very useful tool for quickly manipulating the spectral characteristics of a sound according to your synthetic or analytic goals. The synthetic forms can be run with the scripts S.filter_with_chord_synthesis or S.filter_with_breakpoint_synthesis; the analysis-based form with S.filter_with_analysis. The analytic form is a powerful tool for bringing the color of one sound into the realm of another.
Chordresponsemaker is a routine that uses a collection of harmonic tones, variable in size, to create a synthetic frequency response. It is found in various filtering scripts.
Filtresponsemaker is a routine that uses breakpoints and straight lines to create a synthetic frequency response. It is found in various filtering scripts.
Tvfilter is the time-varying (tv) form of filter. Tvfilter uses a pvanalysis file to change the magnitudes of the input sound file. As it is with filter, tvfilter multiplies the magnitudes of the analysis FFT against the magnitudes of the input sound's FFT, while preserving the frequency/phase characteristics of the input sound. Preserving the phase of the input sound file results in a cross-synthesis which sounds like the input sound file covered or suppressed by the shadow of the analysis file. Like filter, tvfilter offers a variety of controls for manipulating the filter characteristic. The use of a phase vocoder analysis to represent the filter characteristic also makes possible the temporal control of the filter file (i.e. backwards/forwards control) as found with twarp. Run this using the script S.tvfilter.
Ring uses the phase vocoder to create an all-pass resonator. It works by structuring the FFT resynthesis as a bank of feedback filters that feed back the sinusoid of each bin in a strength proportional to the amplitude of that bin (after adjustment by global feedback controls). This allows the sound to "ring" in a way something like reverb or comb filter resonance. The difference from comb filtering is that with ring, spectral resonance is created not through a collection of comb filters selected for their ability to resonate various pulse wave spectra, but rather through an array of feedback filters (sized by the FFT) that resonate a sine wave spectrum while dynamically tuning their feedback frequencies to the frequencies of the input sound. In short, it creates a kind of "self resonance". Ring is a nice way of increasing the resonant pitch characteristics of a sound, although it has its weaknesses. Ring works best with larger FFT sizes, as it is attempting to synthesize or accentuate the more pitched/harmonic characteristics of the sound; this is something larger FFTs, with their increased frequency resolution, handle better. Use of the Kaiser window, with its low sidelobe amplitudes, helps as well. In addition, there is a threshold for preventing the noise features of a sound from being resonated, plus an EQ which can be positioned to filter either the source input to the feedback loop, or the feedback return. Run this using the script S.ring.
Ringfilter marries filter with ring by allowing a frequency response to be imposed on the resonance created with ring. Ringfilter begins to look more like multiple-delay, comb filter resonance since the static frequency response selects which frequencies will feed back. What is unique here is that the frequency response can come from an analysis, allowing the input sound to be resonated by the average spectral characteristic of another sound. A synthesized frequency response can be used as well. Like the EQ in ring, the filter in ringfilter can be positioned to filter either the source input to the feedback loop, or the feedback return, where it will have the effect of introducing the filter characteristic more slowly through the resulting variable rates of decay. Run ringfilter with S.ringfilter_with_chord_synthesis to create a synthetic frequency response, and with S.ringfilter_with_analysis for an analyzed frequency response.
Ringtvfilter is to ringfilter what tvfilter is to filter; that is, it makes the filter in ringfilter time-varying. This is a sophisticated idea, that is, time-varying filtering of the resonance of a time-varying sound. The best characterization would be to say that ringtvfilter imprints the shadow of one sound onto the reverb of another. Ringtvfilter requires some thought and finesse in order to separate and articulate the evolutions of the source, resonance, and filter. The best results are created using dynamic, high-profiled source sounds, rich with transient noise, and more constant, pitch/harmonic sounds for the time-varying filter. Like tvfilter, ringtvfilter requires an analysis file. Run this routine using S.ringtvfilter.
The idea behind filtdeviator is to use a frequency response function not only to filter a sound (as with filter), but to create a topology of frequency deviation working in correlation with the filter. Consequently, filtdeviator is filter with added parameters for specifying how the filter frequency response function will be mapped into the deviation of frequency. The added parameters set the base and peak deviation for how the response will be mapped into both pitch transposition and frequency shift, and how the function will be warped within the range set by these limits. There is also a master (0-1) deviation control for globally controlling the deviation. All the controls of filtdeviator allow you to dynamically vary the presence and effect of amplitude filtering and frequency deviation, making filtdeviator an interesting routine for exploring the way filters can be used to impede/transform the resonant signature of a sound. Using small amounts of frequency deviation, with no amplitude filtering, and a sweeping transposition of the filter will produce an effect something akin to the commercial guitar phase shifter; larger amounts of deviation take it into another place entirely. Adding the correlated amplitude filtering conceals the deviation more (positioning it more at the edges of formants), producing a sound something like the floppy resonant behavior of slide whistles. The scripts to run filtdeviator -- S.filtdeviator_with_chord_synthesis and S.filtdeviator_with_analysis -- are designed with frequency response synthesis/analysis sections like those for filter and ringfilter. Run this routine using either S.filtdeviator_with_analysis or S.filtdeviator_with_chord_synthesis.
Tvfiltdeviator is to filtdeviator what tvfilter is to filter; i.e. it uses a time-varying filter response in place of the constant one. This routine blows the lid off of what was unusual about filtdeviator. It's great for making wacky sounds out of ones with nice, fixed harmonies. The best use is to have a sound deviate itself. Try taking something like a harpsichord or guitar (pitched stuff with decay) and do an analysis of the sound with pvanalysis. Then use the analysis to deviate the same sound. What happens is that the strength of each of the sound's components becomes a control over the frequency deviation of that component, one that causes the sound to go "sproing" whenever it has any amplitude. It makes tonal music sound really broken. Run this routine with tvfiltdeviator.
Envelope is a routine for tracking the amplitude envelope of a sound. Output can
be ASCII, floats or a NeXT soundfile. Selecting floats or ASCII will produce
a file suitable for use in the control of a parameter.
*** ECMC Notes: *** At Eastman, obtain a script to run this routine
with envelope.tp. See example file envelope1, which is used in
example plainpv7.
Centroid is a routine for tracking the centroid of a sound. The centroid is the average
of all the frequencies weighted by their amplitudes. It essentially gives
you a kind of center frequency value for your spectrum. The analysis can
be restricted to a band of frequencies, allowing the centroid to track a
particular frequency component (although pitchtracker can do this as well). Selecting floats or ASCII will produce a file suitable
for use in the control of a parameter.
*** ECMC Note: *** ECMC users can use centroid.tp to obtain a
template script file, edit this file and then use it to run centroid.
Currently, there are no ECMC example files for centroid.
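As a rough illustration of the definition above (not PVC's actual code), the centroid of one analysis frame can be computed as an amplitude-weighted mean of the bin frequencies. The function name, the band limits `lo` and `hi`, and the choice to return 0 for a silent frame are assumptions made for this sketch.

```python
def spectral_centroid(freqs, amps, lo=0.0, hi=float("inf")):
    """Amplitude-weighted mean frequency of the bins lying in [lo, hi] Hz."""
    pairs = [(f, a) for f, a in zip(freqs, amps) if lo <= f <= hi]
    total = sum(a for _, a in pairs)
    if total == 0.0:
        return 0.0  # silent frame: a convention chosen for this sketch
    return sum(f * a for f, a in pairs) / total

# One hypothetical frame: four bins with their frequencies and amplitudes.
freqs = [100.0, 200.0, 300.0, 400.0]
amps = [1.0, 0.5, 0.25, 0.25]

print(spectral_centroid(freqs, amps))                # whole spectrum
print(spectral_centroid(freqs, amps, 250.0, 450.0))  # restricted to a band
```

Restricting the band to 250-450 Hz makes the result follow only the two upper components, which is the sense in which a band-limited centroid can track a particular frequency component.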
Fluxoid is a routine for tracking the average frequency change of a sound. The average can be weighted (best) or not by the amplitudes.
Selecting floats or ASCII will produce a file suitable for use in the control
of a parameter.
*** ECMC Note: *** ECMC users can use fluxoid.tp to obtain a
template script file, edit this file and then use it to run fluxoid.
Currently, there are no ECMC example files for fluxoid.
Pitchtracker is a routine for tracking the fundamental pitch trajectory of a sound. It
is an experimental routine that works, I believe, but forever has its quirks.
Three detection methods are available, following 1) the fundamental of
the harmonic collection, 2) the strongest formant, or 3) a band-limited
centroid. Different output formats let you see, hear and eventually use
the fruits of your pitch tracking.
*** ECMC Note: *** ECMC users can use pitchtracker.tp to obtain a
template script file, edit this file and then use it to run pitchtracker.
See example file pitchtracker1. The analysis output produced by this
example is used in example plainpv8.
CONTROL FUNCTION PROCESSING : RESHAPE
Reshape is a routine for transforming function streams to meet the needs of different parameters. It takes a headerless float or ASCII function file as input and outputs a headerless stream of float or ASCII values. With the appropriate flags, it can be used to limit, resample, translate, warp, expand, shrink, invert, quantize, and lowpass filter the input values. The output can be translated into different amp or pitch units depending on your needs. Run reshape at the command line.
*** ECMC Note: *** reshape generally is used in a pipe after
a gen routine to
remap the values created by the gen routine to some new maximum and
minimum range.
For usage examples, see ECMC
example files (e.g. twarp3, twarp4-1, twarp4-2 and inharm2-3.)
(*** End of this ECMC note ***)
Below are various terms, parameters, or ways of doing things which are common to many of the routines.
OVERLAP/ADD VS. OSCILLATOR BANK METHODS AND RESYNTHESIS THRESHOLDS:
The phase vocoder resynthesizes the signal using one of two methods, depending on the type of changes made to the FFT. If the changes are only to the magnitudes (amplitudes), then the faster overlap/add method is used. If however changes in frequency are made, then the FFT integrity is compromised, necessitating use of the oscillator bank method in which each bin is synthesized as a sine wave changing in frequency and amplitude. This method is slower, although a resynthesis threshold is available which can be used to increase the computation speed by turning off bins whose amplitude falls below the threshold. A threshold of -60dB is appropriate, although safety warrants using a lower threshold if the spectrum is thin and its decays exposed; use your ear.
The source sound is the original input sound. Some routines allow for the mix of the processed sound with the original source sound.
All routines allow both monophonic and multi-channel input files to be processed. With multi-channel files, you can either select one channel and produce a monophonic output file, or process all the channels. Channels are numbered beginning with 1. Processing of multi-channel files is done one channel at a time beginning with channel 1, with zeros written to channels which have yet to be processed. Processing one channel at a time requires less memory and allows you to audition the output sooner than if you did all channels at once.
FLOATING-POINT AMPLITUDE RESCALING
Selection of the floating-point, output-file format invokes an amplitude rescaling feature. Once processing is complete,
a second pass through the sound file is made to rescale the values to the
decibel level specified. A dB rescale level of 1 causes rescaling to the
level of the original input file.
*** ECMC Note: *** Most ECMC users will never use the floating point
option, and thus will never use this rescaling option, although I have included
it near the bottom of the user parameter section of the ECMC .tp script files.
Two flags are provided for controlling the output amplitude statistics: one turns the statistics on or off, and the other sets how often they will be reported. The statistics provide the peak output level in amplitude and decibels. With integer format output files, output values exceeding the normalized peak amplitude of 1.0 (0 dB) are clipped to a value of 1.0, and the statistics are placed in clip mode; in clip mode, reports are made only for frames where clipping occurs. The peak amplitude, its time, and the number of clipped samples are reported at the end of processing. With floating-point format output files, output values exceeding the normalized peak amplitude of 1.0 are not clipped, since they will be rescaled in the second pass; output statistics proceed normally throughout. The levels before and after rescaling are reported at the end of processing.
FREQUENCY RESPONSE TERMINAL OUTPUT
In many filtering or companding routines, a crude terminal print of the frequency response is available. A flag sets the high cutoff frequency for this output; a value of 0 (0 Hz) turns printing off.
Analysis files are binary, 32-bit floating-point files written by pvanalysis, containing frames of FFT analysis data for one or more channels. Analysis file data is preceded by a header containing information about the analysis. Analysis files are much larger than the sound files they represent, and increase in proportion to the FFT size used. As such, files can become very large, so it is advisable to make them only when needed unless you have disk space to spare.
Amplitude is always handled in decibel units. The greatest magnitude of the 16-bit short integer is equated with an amplitude of 1.0 or 0 dB. 0 dB functions as unity gain, and as the peak amplitude in matters of compression, expansion, and amplitude windowing. A change of +/- 6 dB represents a doubling or halving of the amplitude. Increments of 10 dB are loosely associated with one change in dynamic level. 16-bit shorts allow for a 96 dB dynamic range. Take care not to lose signal level as a consequence of processing, since quantization noise will emerge when you attempt to regain your signal level by amplifying the integer sound file.
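The decibel figures quoted above follow from the standard amplitude conversion, dB = 20 * log10(amplitude). The short sketch below (helper names are my own, not part of PVC) shows the arithmetic.

```python
import math

def db_to_amp(db):
    """Convert decibels to a linear amplitude, with 0 dB equal to 1.0."""
    return 10.0 ** (db / 20.0)

def amp_to_db(amp):
    """Convert a positive linear amplitude to decibels."""
    return 20.0 * math.log10(amp)

print(round(amp_to_db(2.0), 2))      # doubling the amplitude
print(round(amp_to_db(0.5), 2))      # halving the amplitude
print(round(amp_to_db(2 ** 16), 1))  # amplitude ratio spanned by 16 bits
print(round(db_to_amp(-60.0), 4))    # a typical resynthesis threshold
```

Doubling and halving come out at about +6.02 and -6.02 dB, and the 16-bit ratio at about 96.3 dB, which is where the rounded figures of 6 dB and 96 dB come from.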
Equalization has been provided at various points in routines to allow for the needed adjustment of spectra. The EQ consists of low and hi shelf segments, whose width is adjusted through control of the shelf breakpoint frequency. The region between the shelf segments is represented by a linear decibel gradient between the decibel levels of the two shelves. Some routines implement the EQ before pitch changes, others after. EQ placed before pitch changes (pre-transpose/shift) will cause the EQ to be transposed with the pitch changes, whereas afterwards (post-transpose/shift) will keep them fixed as shifts and transpositions occur.
Many of the routines employ the principle of warping in which a distribution of values is transformed by an identity function. In these places an exponential function is employed to remap a 0-1 range of values into a new orientation that preserves the minima (0) and maxima (1) while bringing the distribution closer to either extreme as a result of the curvature of the exponential function selected. The curvature of the exponential function is selected through a warp index. Specifically, warp index w will reorient the input x through the function below (^ = exponentiation).
y = (1. - (e^(x * w))) / (1. - (e^w))
In this function, the warp index of 0 produces a linear function and an untransformed output. Positive warp index values of increasing magnitude produce curves of increasing concavity (increasing slope) that draw values towards the 0-valued minima, and reduce the function integral. Negative values do the opposite, drawing values towards the maxima of 1, increasing the integral.
The practical use of this mechanism is found in various places. One such place is the reshaping of the frequency response distribution characteristics. In this, positive warp indices cause the peaks of the response to be accentuated while the weaker frequencies are expanded out (i.e. pushed towards 0). Negative values have the opposite effect, as they compress the dynamic range of the response and raise the relative level of the weaker noise components. Another place where warp applies is in the remapping of FFT amplitudes through the spectrum warpshape. In this, the successive FFT frames have their amplitudes remapped by the warp function, similarly expanding or compressing the dynamic range depending upon the warp specified; 0 (linear warp function) leaves the amplitudes unchanged.
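The warp function given above can be tried out directly. This sketch evaluates it in Python; the only addition is the special case for a warp index of 0, where the formula as written would divide zero by zero and its limit (the straight line y = x) is used instead. The function name is my own.

```python
import math

def warp(x, w):
    """Remap x in the range 0-1 using warp index w; 0 and 1 are preserved."""
    if abs(w) < 1e-9:
        return x  # limit of the formula as w approaches 0: linear
    return (1.0 - math.exp(x * w)) / (1.0 - math.exp(w))

for w in (-4.0, 0.0, 4.0):
    row = [round(warp(x / 4.0, w), 3) for x in range(5)]
    print(w, row)
```

With w = 4 the midpoint 0.5 is pulled down to roughly 0.12, and with w = -4 it is pushed up to roughly 0.88, while the endpoints stay at 0 and 1 in every case.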
With the pitch transposition control, a constant or function value is multiplied against all bin frequencies. This is classic transposition, here specified in semitones of transposition (12 semitones equals an octave). Conversion is made to produce the appropriate frequency multiplier.
With the frequency shift control, a constant or function value is added to all the bin frequencies to produce a nonlinear pitch domain translation of the spectrum. Frequency shift is related to things like ring modulation and their similarly nonlinear shifts of pitch characteristics. Use this to create small distortions of the harmonic integrity of a sound.
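The difference between the two controls is easy to see with numbers. In this sketch (function names are mine), transposition multiplies each frequency by the equal-tempered ratio 2^(semitones/12), while shifting adds a fixed number of Hz.

```python
def transpose(freqs, semitones):
    """Multiply every frequency by the ratio for the given semitones."""
    ratio = 2.0 ** (semitones / 12.0)
    return [f * ratio for f in freqs]

def shift(freqs, hz):
    """Add a constant number of Hz to every frequency."""
    return [f + hz for f in freqs]

harmonics = [100.0, 200.0, 300.0]      # a harmonic series on 100 Hz
up_octave = transpose(harmonics, 12)   # still a harmonic series
shifted = shift(harmonics, 50.0)       # ratios between partials change

print(up_octave, [f / up_octave[0] for f in up_octave])
print(shifted, [f / shifted[0] for f in shifted])
```

After transposition the partials remain in the ratios 1:2:3, whereas after a 50 Hz shift they no longer form whole-number ratios, which is the "distortion of harmonic integrity" referred to above.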
The rate at which amplitude changes are allowed to occur affects how smooth spectral evolutions will be. To control this, many routines contain attack and decay response time controls; once translated, these controls manipulate the coefficients of the following filter.
y(n) = (1. - A) * x(n) + A * y(n-1)
The filter is a lowpass designed to increasingly smooth the sudden changes in a signal as the value of the coefficient, A, is increased. Its control is through the response time parameter, which is the time in seconds it takes a signal, shifting from one state to another, to decay to -60 dB of its former state. Response times are transformed to create the necessary coefficients for the selected frame rate. The response time is separated into attack and decay; this allows separate control of the smoothing of the signal depending upon whether it is increasing or decreasing in amplitude. Short attack/decay response times can be used in places where dynamic processing induces garble or even pops. You can use longer response times to generally smooth or blur the onset/offset of sound components, particularly if the response controls are being applied to a time-varying filter. When applied to amplitudes, longer decay response times do not sound good, for in their delay of the decay, they end up amplifying the residual noise of a sound.
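To make the response time idea concrete, here is a sketch of how a coefficient could be derived and applied, one value per analysis frame. The derivation is my assumption, not taken from the PVC source: A is chosen so that A raised to the number of frames in the response time equals 0.001 (-60 dB). The function names are hypothetical.

```python
def coeff(response_time, frames_per_second):
    """Coefficient A such that a dropped signal reaches -60 dB after response_time."""
    if response_time <= 0.0:
        return 0.0  # no smoothing
    return 0.001 ** (1.0 / (response_time * frames_per_second))

def smooth(values, attack, decay, frames_per_second):
    """One-pole lowpass: y = (1 - A) * x + A * previous y, with A picked per direction."""
    a_up = coeff(attack, frames_per_second)
    a_down = coeff(decay, frames_per_second)
    out, y = [], 0.0
    for x in values:
        a = a_up if x > y else a_down  # rising uses attack, falling uses decay
        y = (1.0 - a) * x + a * y
        out.append(y)
    return out

# A bin that jumps to full level for one frame and then goes silent,
# smoothed with instant attack and a 1 second decay at 200 frames/sec.
track = smooth([1.0] + [0.0] * 200, attack=0.0, decay=1.0, frames_per_second=200)
print(track[0], track[100], track[200])
```

Under this assumption the bin sits at 1.0 on the first frame and has fallen to 0.001 (that is, -60 dB) exactly 200 frames, or one second, later.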
Decay time is an issue in the feedback of the ring routines. Like response time, it is the time it takes the signal to decay to -60dB of its former state, or better, the time it takes the reverb to decay to -60dB.
The FFT size must be a power of 2. Larger FFT sizes resolve frequencies better but transient behavior more poorly. Choose your FFT size according to the sound you are working with. A size of 1024 or 2048 works well in most cases.
The window size is a less opaque parameter; like the FFT, it must be a power of 2. Windows which are twice the size of the FFT work well. Larger window sizes may resolve frequencies better. Specifying 0 for the window size will automatically set the window to twice the FFT size, a feature I have always used.
The FFT and inverse FFT are computed using a window. Like the FFT size, the shape of the window used can affect the quality of the analysis and resynthesis. (See F.R. Moore, Steiglitz, or Roads for further explanation.) A variety of windows are available, including Hamming, Rectangular, Blackman, Triangular, and Kaiser (in 8 different forms as related to 8 different alpha values). Blackman (-w2) or Kaiser (-w8) are recommended for most applications. In some unusual cases where transient behavior is being lost, consider using other windows such as the Rectangular, although take care to ensure that it is not producing pops or a buzzy sound.
The frames-per-second parameter controls how often the phase vocoder will perform an analysis on the signal. It is a translation of the classic decimation control, which specifies how many samples to skip between analysis frames. More frames increase the time resolution but decrease speed. 200 frames per second is a good reference point. If you expand time, you should increase this proportionately to maintain about 200 or more frames per second.
Once the spectral modifications are made to the FFT analysis, an inverse FFT is invoked to produce the samples of a time-domain signal. The classic phase vocoder paradigm controls the number of samples through the interpolation value and its relation to the decimation. The arcane relationship of decimation and interpolation is here translated into the parameter of time expansion/contraction, allowing for the direct scaling of time. Use values greater than 1 to expand time, and values less than 1 to contract it.
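For those curious about the classic parameters hidden behind frames per second and time expansion, the usual textbook relationship is sketched below. This is my assumption about the translation, not a description of PVC's internals: decimation is the number of input samples between analysis frames, and interpolation is the number of output samples between resynthesis frames, so their ratio sets the time scaling.

```python
def hop_sizes(sample_rate, frames_per_second, time_scale=1.0):
    """Return (decimation, interpolation) in samples for the given settings."""
    decimation = round(sample_rate / frames_per_second)
    interpolation = round(decimation * time_scale)
    return decimation, interpolation

print(hop_sizes(48000, 200))        # no time change
print(hop_sizes(48000, 200, 2.0))   # twice as long
print(hop_sizes(48000, 400, 2.0))   # twice as long, frame rate doubled
```

The last line illustrates the advice above: doubling the analysis frame rate when time is expanded by 2 keeps the output at roughly 200 frames per second.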
Processing may be performed on an entire file or a segment of it by specifying begin and end times. End times less than or equal to 0 default to the end of the input file.
Gain can be applied to the output and to other components. 0 dB represents unity gain (no change). See decibels.
FILTERING: SOURCE SIGNAL LEVEL
The mix of source and filtered sounds in the filter routines can be controlled by the source decibel floor. This value, taken from the -96 to 0 dB range, specifies the level of the source signal. The filtered signal level is equal to (1 - source amplitude floor). Consequently, the source level functions as a floor over which lies the filtered signal. A source floor of 0 dB would neutralize filtering, since there would be no filter range above the floor; a floor of -96 dB would produce the full effect of the filter.
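The floor arithmetic can be sketched as follows. The conversion of the floor from dB to amplitude and the (1 - floor) filtered level come from the description above; the way the two are combined into a single gain per bin is my assumption, and the function names are hypothetical.

```python
def mix_levels(floor_db):
    """Return (source level, filtered level) as linear amplitudes."""
    source = 10.0 ** (floor_db / 20.0)
    return source, 1.0 - source

def bin_gain(response, floor_db):
    """Gain for one bin whose filter response (0-1) sits on top of the floor."""
    source, filtered = mix_levels(floor_db)
    return source + filtered * response

for floor_db in (0.0, -6.0, -96.0):
    print(floor_db, round(bin_gain(0.0, floor_db), 4), round(bin_gain(1.0, floor_db), 4))
```

At a 0 dB floor every bin gets a gain of 1 whatever the response says (no filtering), while at -96 dB the gain is essentially the response itself (full filtering).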
TRANSPOSITION/SHIFT APPLICATION FLAG
Filter routines which allow for transposition and frequency shifting of both filter and source have a flag which specifies whether transposition/shift should be applied before or after filtering. If it is applied before, the pitch transposition trajectory will evolve independent of the filter's trajectory of transposition. If it is applied after, then the pitch transposition trajectory will be added to the filter transposition trajectory, causing the filter to move in parallel with the pitch transposition movements plus any movements the filter transposition parameter adds.
Filters can be toggled to use frequency responses in pass or rejection mode. In pass mode, the response's stronger magnitudes are used to pass source through the filter; in rejection mode, to impede or reject components. In rejection mode, the response is created by inversion in the decibel range, not amplitude. In time-varying filtering (tvfilter), rejection can be in mode 1 in which the response is inverted against a constant 0 dB peak, or in mode 2 in which the response is inverted against the current analysis frame's peak amp. Spectral warping is always applied after the response has been transformed by rejection.
Many routines which use frequency response files to filter or warp amplitudes have a control which allows the response to be smoothed. The smoothing is produced by replacing the magnitude of a frequency bin with an average taken from a band centered around that bin. The degree of smoothing is controlled through manipulation of a bandwidth value, specified in octave units. Larger bandwidths produce greater degrees of smoothness; 0 turns smoothing off.
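A minimal sketch of this kind of smoothing is given below. It assumes the band is centered geometrically (half the bandwidth below and half above the bin, in octaves) and that a plain arithmetic mean of the magnitudes is taken; PVC may weight or center the band differently. The function name is my own.

```python
import math

def smooth_response(mags, octaves):
    """Replace each bin magnitude with the mean over a band `octaves` wide."""
    if octaves <= 0.0:
        return list(mags)  # 0 turns smoothing off
    half = 2.0 ** (octaves / 2.0)  # frequency ratio for half the bandwidth
    out = []
    for k in range(len(mags)):
        lo = int(math.ceil(k / half))                   # lowest bin in the band
        hi = min(len(mags) - 1, int(k * half))          # highest bin in the band
        band = mags[lo:hi + 1]
        out.append(sum(band) / len(band))
    return out

spike = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
print(smooth_response(spike, 0.0))
print([round(v, 3) for v in smooth_response(spike, 1.0)])
```

Because the band is measured in octaves, it covers more bins at high frequencies than at low ones, so the upper part of a response is smoothed over a wider range of Hz.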
Routines which use analysis data made with pvanalysis -- twarp, convolver, tvfilter, ringtvfilter, and tvfiltdeviator -- access the data in the same way, using the time-point, rate, and data window boundary parameters, set to function in either rate or explicit mode. In rate mode, the rate determines the speed of movement through a data file; the time-point sets the starting position. The rate may be positive (forward in time) or negative (backwards in time), and may vary according to a function. Explicit mode uses the time-point parameter to specify exactly where the analysis data should come from (units here are in the time of the analyzed sound). (Explicit mode does not use the rate control, and makes sense only if the time-point is controlled with a function.) Both rate and explicit modes abide by the upper and lower data window boundaries which delimit the data range. When the time-pointer moves beyond the specified upper and lower time boundaries, it re-enters the window from the other end, making the window into a circular/modular structure. The boundaries can be controlled with functions as well, giving this mode an expressive dimension far surpassing the time expansion/contraction parameter. There is also an auto-stop feature that, when turned on, causes processing to stop when it reaches the end of the analysis.
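The circular behavior of the time-pointer in rate mode can be sketched with modular arithmetic. This is an illustration under my own naming, not PVC's code: `frame_dt` stands for the time between output frames, and the boundaries are in seconds of the analyzed sound.

```python
def advance(pointer, rate, frame_dt, lower, upper):
    """Move the time-pointer by rate * frame_dt, wrapping inside [lower, upper)."""
    pointer += rate * frame_dt
    span = upper - lower
    return lower + (pointer - lower) % span

# Forward motion that runs past the upper boundary re-enters at the bottom.
print(round(advance(1.9, 1.0, 0.2, 1.0, 2.0), 3))
# Backward motion that runs past the lower boundary re-enters at the top.
print(round(advance(1.05, -1.0, 0.1, 1.0, 2.0), 3))
```

Since the boundaries are passed in on every call, they can themselves change from frame to frame, which mirrors the option of controlling them with functions.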
The convolver routine has a unique panpot mechanism for controlling the mix of input sounds (A and B) with their convolution. The panpot is a crossfade mechanism that uses a -1 to 1 control range to accentuate either sound A, sound B, or their convolution. A value of -1 produces an output consisting entirely of sound A; a value of 1, sound B. The 0 between these extremes produces the convolution of A and B. Values between these points produce a crossfade mix of either A or B and the convolution. For example, a trajectory from -1 to 1 would crossfade from sound A into the convolution, and on to sound B. Separate gain controls for A, B and the convolution make it possible to tune the continuity of this trajectory. In addition, the presence or spread of the convolution into the crossfade range can be tuned with the domain warp controls. The domain warp reshapes the movement through the crossfade range, allowing you to create a more gradual approach from A or B into the convolution center. This is achieved through a simple nonlinearizing of the crossfade domain in warp index style. Increasingly positive domain warp values (specified independently for each side) transform the linear trajectory towards the convolution into a decelerating one, causing the subtle mix area around 0 to be expanded. Therefore, if you want to hear more convolution in your crossfade, increase the panpot domain warps.
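The panpot logic might be pictured like this. The sketch assumes a simple linear crossfade between each side and the convolution, and applies the warp index formula from the WARP INDEX section to the distance from the center; both choices are assumptions on my part, and the function names are hypothetical.

```python
import math

def warp(x, w):
    """Warp index function; w = 0 is treated as linear."""
    if abs(w) < 1e-9:
        return x
    return (1.0 - math.exp(x * w)) / (1.0 - math.exp(w))

def panpot(pan, warp_a=0.0, warp_b=0.0):
    """Return the mix weights (A, B, convolution) for pan in the range -1 to 1."""
    side = warp(abs(pan), warp_a if pan < 0 else warp_b)
    conv = 1.0 - side
    return (side, 0.0, conv) if pan < 0 else (0.0, side, conv)

for pan in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(pan, panpot(pan), panpot(pan, warp_a=4.0, warp_b=4.0))
```

With a domain warp of 4 on each side, a pan of 0.5 carries far more convolution than it does with no warp, which matches the advice to increase the domain warps when you want to hear more convolution in the crossfade.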
FREQUENCY RESPONSE ACCUMULATION METHOD
Several of the response-producing routines have the option of accumulating the response by either peak or average means. Whereas peak responses represent the record of a sound's thresholds (or synthesis specification's highest values), average responses represent the most common characteristics. If the sound you are analyzing has intermittent moments of sound whose peak characteristics you wish fully represented in the response, use the peak mode; otherwise use the average.
RING ROUTINES: FILTER PLACEMENT
Ringfilter and ringtvfilter (for which there are no ECMC .tp scripts) use frequency response functions to filter the reverb. Two filtering modes are available, in which either the source input to the feedback is filtered, or the feedback itself. When the response is used to filter the source input, it filters the signal before it enters the feedback mechanism, imposing its characteristic, from the start, on the feedback. However, when the filter is positioned on the feedback component, the response's spectral characteristic appears in the reverb gradually as the signal decays. In this mode, the time it takes the signal to decay into the response characteristic is controlled by an additional decay time associated with the filter.
Spectral compression and expansion play a role in many routines. Their implementation here is according to the traditional model that uses thresholds and magnitudes of compression/expansion to reduce or enlarge the dynamic range of a signal. With spectral compression, amplitudes that exceed the specified compression threshold are reduced by an amount determined by the decibels of compression (a multiplier of the bin's amplitude lying above the threshold). Expansion works in a similar fashion, except that it changes the amplitudes below, rather than above, the expansion threshold; this results in an expansion of the dynamic range as the bins falling below the threshold are made to cover a wider range.
The term companding or compander is a merging of the two names, useful in situations where they are both available. While compander is the most obvious example of a routine using companding, traditional compression can be found in several other routines that involve filtering. It is not uncommon, in those routines, to reduce the dynamic range of an analyzed frequency response, particularly if it is time-varying, since the goal in filtering is more about color than dynamic range.
In all routines that use some form of companding, the dynamic range of the unprocessed signal/response is assumed to lie between 0 and -96 dB; thresholds are chosen from within this range. The degree of compression or expansion, expressed in decibels, represents how much the signal lying beyond the threshold will be reduced. A value of -6 dB would halve the dynamic range above the threshold in compression, or double the range below the threshold in expansion.
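One way to read the description above is sketched here: the compression/expansion amount in decibels is converted to a multiplier (-6 dB gives roughly 0.5), which scales the part of a bin's level lying above the threshold in compression, and divides the part lying below the threshold in expansion. This is my interpretation for illustration only; the function name and the `expand` switch are hypothetical.

```python
def compand_db(level_db, threshold_db, amount_db, expand=False):
    """Return the new level in dB of one bin after compression or expansion."""
    ratio = 10.0 ** (amount_db / 20.0)  # e.g. -6 dB gives about 0.5
    if not expand and level_db > threshold_db:
        # compression: shrink the distance above the threshold
        return threshold_db + (level_db - threshold_db) * ratio
    if expand and level_db < threshold_db:
        # expansion: stretch the distance below the threshold
        return threshold_db - (threshold_db - level_db) / ratio
    return level_db  # on the unaffected side of the threshold

print(round(compand_db(0.0, -40.0, -6.0), 2))                 # compressed peak
print(round(compand_db(-50.0, -40.0, -6.0, expand=True), 2))  # expanded low bin
print(compand_db(-50.0, -40.0, -6.0))                         # below threshold, untouched
```

In this reading a 0 dB peak over a -40 dB threshold lands near -20 dB (the 40 dB range above the threshold is halved), and a bin 10 dB under an expansion threshold ends up about 20 dB under it.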
Compander applies compression for each frequency bin separately rather than as a macro gain change. It does this by using a frequency response file, created with freqresponse, to establish a unique, 0 dB point of reference for every bin; using its unique point of reference, every bin is compressed or expanded.
FILE CONVERSION: aiffs, aiffd, nexts, nextd, nextfloats
The sound file conversion scripts: aiffs, aiffd, nexts, nextd, and nextfloats
are shell scripts available for converting sound files back and forth between
aiff and next formats, or from next to floats. They are all effectively
SGI scripts since they use the SGI sound file format conversion utility, sfconvert. Aiffs and aiffd take next integer files and write new aiff files, nexts and nextd the opposite; in addition aiffs and aiffd can be used to write new aiff integer files converted from next float files.
Nextfloats writes a new float file from a next integer file. The s or d following the aiff or next in the name stands for the action taken on the original file once the new
file is made; the s saves the original file (i.e. does not delete it), the d causes it to be deleted. Multiple files may be converted with the same run
of the command. Running the command without any input files will produce
a description of the routine.
*** ECMC Note: *** ECMC users probably will never
need to use these file format conversion utilities, since script files
provided by the ECMC .tp utilities take care of all necessary format
conversions. Most of the file conversion utilities mentioned here do not work
on the ECMC Linux systems anyway.
FUNCTION VIEWING: showme, showspect
Two graphing scripts are available for viewing functions and spectral data. You must have gnuplot installed on your computer to use them (type gnuplot <CR> to see if you do). Showme is a simple script for viewing function files. Run it without an input file for a description. Showme takes headerless floating-point or ASCII (give the -a flag) function files and plots them. Showspect plots the file of FFT amplitude or frequency data produced by the plainpv script, S.plainpv_with_printout_and_graph_files. Showspect is useful for seeing a graphic representation of a very particular part of an analysis; it is not a substitute for a standard spectrogram application.
GEN FUNCTION CONTROL OF PARAMETERS
Any parameter whose flag on the routine's information page has the word
(func) after it [or, within an ECMC template script file, includes the
comment # int, float or FUNC] can be controlled by a function file.
To make these files, complete CMUSIC gen command lines are inserted
into a script, like this:
gen4 -L1000 0 -3 0 1 3 > $SFDIR/ptrans ;
The file $SFDIR/ptrans created by this sample command contains floating point values representing the trajectory which the pitch transposition should take. Function files created by gen routines within a PVC script file are how time-varying parameter values are specified. Such function file definitions may be placed near the top of a script file that runs a PVC program, before the arguments to plainpv or whatever other PVC program is being used, and all of these function file definitions may be grouped together. Alternatively, they can be created within the body of a PVC script file, perhaps just before, or even just after, the parameter which they control.
Lines in shells can be continued onto new lines with the backslash, which comes in handy with gen functions. The above, for example, could be entered as:
gen4 -L1000 0 -3 0 \
1 3 > $SFDIR/ptrans ;
Near the top of the ECMC .tp template files I have included
commented template lines for creating function tables with CMUSIC gen
routines:
##### Cmusic function file generator templates #####
# gen0 normalizes function files previously created with other gen routines
# gen0 -Llength max < inputfuncfile > outputfuncfile
# gen1 creates linear {straight line} segments, like Csound gen 7
# gen1 -Llength t1 v1 ... tN vN
# gen2 generates harmonic waveforms from sine {a} & cosine {b} amps
# gen2 -Llength [-o (default) or -c] a1 ... aN b0 ... bM N
# gen3 generates amp values & linear connections at equally spaced time points
# gen3 -Llength v1 v2 ... vN
# gen4 generates exponential segments; "a" values determine shape &
# depth of curve: 0 = linear, neg. = exponential, pos. = inverse expo.
# gen4 -Llength t1 v1 a1 ... tN vN
# gen5 is like Csound gen 9 : harmonic1/amp/phase harmonic2/amp/phase
# gen5 -Llength h1 a1 p1 ... hN aN pN
# gen6 generates a table of random numbers between +1 and -1
# gen6 -Llength
# cspline: smooth curve {cubic spline} interpolator
# cspline len_flag [flags] x0 y0 x1 y1 ... xN yN
# genraw reads in a previously created function file
# genraw -LN filename (where N is the length of the output function.)
# For a usage summary of "reshape" type "reshape" with no arguments.
##### End of gen routine function generator templates #####
Many of the ECMC PVC example files, including plainpv5, twarp1 and harmonizer1,
include function table definitions. These function generating routines
are similar in several respects to the gen routines in
Csound. However, whereas
Csound stores function tables in RAM, PVC requires
that these tables be written to disk files. To create a function file, copy the appropriate gen routine
template line to a new line, removing the leading # comment
symbol, edit the line, and specify a file name for the output. Although
these files are fairly small -- typically 1 kb --
I recommend writing them to your $SFDIR ("current working
soundfile") directory, rather than to your current Unix directory
or to /tmp.
The gen routines you are most likely to find useful are those that
create time/value envelope shapes: gen1, gen3 and gen4.
A quick tutorial on gen1 through gen5 is provided below.
Those who want additional information on CMUSIC gen routines can
consult Appendix D in F. Richard Moore's Elements of Computer
Music text (on reserve at Sibley for CMP 421-2).
Excerpts from this appendix are included as an appendix to the hardcopy
ECMC PVC EXAMPLES binder available in the studios.
gen1 creates linear {straight line} segments, like Csound gen 7
syntax: gen1 -Llength t1 v1 ... tN vN
Examples: Either of the following two lines would generate an identical result:
gen1 -L100 0 0 .5 2.5 1. 0 > $SFDIR/updown
gen1 -L100 0 0 50 2.5 100 0 > $SFDIR/updown
Result: the values ascend linearly from 0 to 2.5 half way through the table,
then descend from 2.5 to 0 during the second half of the table
Note: You cannot look at the values within these function tables,
since they are in binary format. If you would like to see the values
in a table, to make sure you are getting what you want, before you
run a job, remove the file redirect at the ends of lines like those
above and include an exit command:
gen1 -L100 0 0 .5 2.5 1. 0
exit
This will cause the table values to be displayed in your shell window.
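If it helps to see what a gen1-style table contains, the following sketch builds one in Python. It is a stand-in written for this explanation, not the cmusic code: the name `gen1_like` is hypothetical, and I am assuming that the first and last table entries fall exactly on the first and last breakpoints, and that the breakpoint times are simply scaled to the table length (which is why 0 .5 1. and 0 50 100 give the same result).

```python
def gen1_like(length, *args):
    """Table of straight-line segments through (time, value) breakpoints."""
    times, values = args[0::2], args[1::2]
    first, last = times[0], times[-1]
    table = []
    for i in range(length):
        # spread the table evenly over the time range of the breakpoints
        t = min(first + (last - first) * i / (length - 1), last)
        seg = 0
        while seg < len(times) - 2 and t > times[seg + 1]:
            seg += 1  # find the segment that contains t
        span = times[seg + 1] - times[seg]
        frac = (t - times[seg]) / span if span > 0 else 1.0
        table.append(values[seg] + frac * (values[seg + 1] - values[seg]))
    return table

a = gen1_like(101, 0, 0, .5, 2.5, 1., 0)
b = gen1_like(101, 0, 0, 50, 2.5, 100, 0)
print(a[0], a[25], a[50], a[75], a[100])
print(all(abs(x - y) < 1e-9 for x, y in zip(a, b)))
```

The table rises from 0 to 2.5 at its midpoint and falls back to 0 at the end, and the two ways of writing the breakpoint times produce the same values.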
gen2 generates harmonic waveforms from sine {a} and cosine {b} amplitudes
syntax: gen2 -Llength [-o (default) or -c] a1 ... aN b0 ... bM N
Example:
gen2 -L100 1. 0 1/3 0 1/5 0 1/7 0 # > $SFDIR/square
Result: harmonics 1,3,5,7 in sine phase are created with proportions of a square wave
gen3 generates amplitude values and linear connections at
equally spaced time points
syntax: gen3 -Llength v1 v2 ... vN
Example:
gen3 -L100 1 .35 1.2 0
Result: Values decrease linearly from 1. to .35 1/3 of the way through the table, then increase from .35 to 1.2 at 2/3 of the way through the table, then decrease from 1.2 to 0 at the end of the table
gen4 can be a powerful but complicated envelope generating routine to use, because one must specify 3 values for each breakpoint except the last, where only 2 arguments are necessary. These arguments for each breakpoint are:
time (t), value (v), and (a), which determines the slope of the curve between this breakpoint and the next.
Syntax: gen4 -Llength t1 v1 a1 t2 v2 a2 ... tN vN
Example: gen4 -L50 0 -2. 1 .33 4 1. .67 2.5 -1 1. 0 > $SFDIR/gliassando
                   t1 v1 a1 t2 v2 a2 t3 v3 a3 t4 v4
Result: Values in the table move with an inverse exponential slope from -2. to 4. over the first 1/3 of the table, then from 4. to 2.5 over the second third of the table, then exponentially from 2.5 to 0 over the final third of the table.
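To get a feel for how the (a) argument bends a segment, here is a Python stand-in for a gen4-style table. It is an assumption on my part that the curve between breakpoints follows the same exponential formula given earlier for the warp index (0 = straight line, positive = slow start, negative = fast start); the names `curve` and `gen4_like` are hypothetical, and the real gen4 may differ in detail.

```python
import math

def curve(x, a):
    """Transition from 0 to 1 as x goes from 0 to 1, bent by a."""
    if abs(a) < 1e-9:
        return x  # a = 0: straight line
    return (1.0 - math.exp(x * a)) / (1.0 - math.exp(a))

def gen4_like(length, *args):
    """Table through breakpoints given as t1 v1 a1 t2 v2 a2 ... tN vN."""
    times, values, alphas = args[0::3], args[1::3], args[2::3]
    first, last = times[0], times[-1]
    table = []
    for i in range(length):
        t = min(first + (last - first) * i / (length - 1), last)
        seg = 0
        while seg < len(times) - 2 and t > times[seg + 1]:
            seg += 1  # find the segment that contains t
        span = times[seg + 1] - times[seg]
        x = (t - times[seg]) / span if span > 0 else 1.0
        step = values[seg + 1] - values[seg]
        table.append(values[seg] + step * curve(x, alphas[seg]))
    return table

glide = gen4_like(50, 0, -2., 1, .33, 4, 1., .67, 2.5, -1, 1., 0)
print([round(v, 2) for v in glide[::7]])
```

Printing every seventh value is enough to see the table start at -2, climb to about 4 a third of the way through, and fall to 0 at the end. Swapping the sign of an (a) value and printing again is a quick way to compare the curve shapes.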
gen5 is similar to Csound's gen9, generating harmonic (or, less often, inharmonic) waveforms. The user specifies one or more partials, and for each partial, three arguments: the partial frequency (as a multiplier of a fundamental of 1), it's relative amplitude (on a scale of 0 to 1), and its phase (between 0 and 360 degrees). The resulting table numbers typically have values between +1. and -1.
Syntax: gen5 -Llength h1 a1 p1 ... hN aN pNFour examples:
(1) gen5 -L1000 1 1 0 > $SFDIR/sine # sine wave
h1 a1 p1
Result: One cycle of a sine wave with values between +1. and -1.
(2) gen5 -L1000 3 1 90 > $SFDIR/harm3Result: Three cycles (the third harmonic) of a cosine wave (a sine wave with a 90 degree phase shift)
(3) gen5 -L1000 2 1 0 4 .5 0 7 .2 0 > $SFDIR/harm247
h1 a1 p1 h2 a2 p2 h3 a3 p3(4) gen5 -L1000 1 1 0 # | reshape -b0 -B1. # > $SFDIR/tempfunc
Result: A sine wave with values rescaled between 0 and 1.
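The four gen5 examples can be approximated with the following Python sketch, which sums sine partials given as (harmonic, amplitude, phase in degrees). It is an illustration rather than the PVC implementation. The names gen5_sketch and rescale are hypothetical, the table is assumed to span exactly one cycle of the fundamental, and rescale is only a guess at what reshape -b0 -B1. does, based on the Result line of example (4).

```python
import math


def gen5_sketch(length, partials):
    """partials: list of (harmonic, amplitude, phase_degrees) tuples."""
    table = []
    for i in range(length):
        x = i / length  # fraction of one fundamental cycle (assumed convention)
        table.append(sum(a * math.sin(2 * math.pi * h * x + math.radians(p))
                         for h, a, p in partials))
    return table


def rescale(table, low, high):
    """Map the table's minimum to low and its maximum to high."""
    tmin, tmax = min(table), max(table)
    return [low + (v - tmin) * (high - low) / (tmax - tmin) for v in table]


sine = gen5_sketch(1000, [(1, 1, 0)])                           # example (1)
harm3 = gen5_sketch(1000, [(3, 1, 90)])                         # example (2)
harm247 = gen5_sketch(1000, [(2, 1, 0), (4, .5, 0), (7, .2, 0)])  # example (3)
tempfunc = rescale(sine, 0.0, 1.0)                              # example (4)
print(min(tempfunc), max(tempfunc))
```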
From example file plainpv5:
gen4 -L1000 0 -90 0 \
.1 12 0 \
.8 3 0 1 -90 > $SFDIR/ampfunc
Two functions from example file plainpv6-1:
gen3 -L1000 5 5 -5 -5 5 > $SFDIR/spectrumfunc
The values remain at 5 during the first 1/4 of the table, then move linearly from +5. to -5. during the second quarter of the table. They remain at -5 during the third quarter of the table, then move linearly from -5. to +5. during the last 1/4 of the table.
gen4 -L1000 0 -2 1 \
.25 -2 1 \
.5 4 1 \
.8 4 1 \
1 2 > $SFDIR/pitchfunc
The floating point values in the file
To avoid cluttering your soundfile directory with these temporary function files, include lines at the very end of a script file (after the OFFICE USE ONLY section) deleting these temporary files, as at the end of example file plainpv5:
Note: If a function definition you create contains an error that makes it impossible for the gen routine to create this function, an error message will be displayed, and the program will ignore this function and use the default values for the parameter(s) where this non-existent function is intended to be used. However, this error message will scroll by quickly near the top of the voluminous diagnostic messages from the program, and it is easy to miss this error. To do a test run of your function definition(s), place an exit command immediately after your function definitions, which will terminate the program at this point, without running the PVC analysis and resynthesis:
gen1 -L1000 0 0 .2 0 3.4 -96 3.68 -96 > $SFDIR/rampdown
gen1 -L1000 0 -96 .2 -96 3.4 0 3.68 0 > $SFDIR/rampup
exit
Run the program. If you get an error message, check your function definition(s) for errors, make any necessary corrections, and run the program again. If you get no error messages, remove the exit line and run the program.
Below is a sample of the output from plainpv.
plainpv -N1024 -M0 -w2 -D400 -I2 -a-0 -P2 -A0 -C0 -t-96 -b0 -e0 -H0 -m200
-X0 -R2000 -L0 -l0 -W0 -T0 -f-1 -F-1 -_1 -=1 -p1 -i.25 /S1/t.snd /S1/cm.mix.snd
/////////////////////////////////////////////////////////////////////
---------------------------------------------------------------------
============================== PLAINPV ==============================
---------------------------------------------------------------------
========================== INPUT SOUNDFILE ==========================
INPUT FILE: FILENAME = /S1/t.snd
INPUT FILE: SAMPLE RATE = 44100
INPUT FILE: NUMBER OF CHANNELS = 2
INPUT FILE: DURATION = 2.770386
INPUT FILE: BEGIN TIME = 0.000000
INPUT FILE: END TIME = 2.770386
INPUT FILE FORMAT: 16-BIT INTEGER
========================== OUTPUT SOUNDFILE =========================
OUTPUT FILE: FILENAME = /S1/cm.mix.snd
OUTPUT FILE: SAMPLE RATE = 44100
OUTPUT FILE: NUMBER OF CHANNELS = 2
OUTPUT FILE FORMAT: 16-BIT INTEGER
OUTPUT FILE: DURATION = 5.540771
======================== ANALYSIS PARAMETERS ========================
FFT SIZE = 1024
*
FUNDAMENTAL ANALYSIS FREQUENCY = 43.066406
*
WINDOW SIZE = 2048
FRAMES/SECOND = 400
DECIMATION SAMPLES (samples between analysis frames) = 110
======================= RESYNTHESIS PARAMETERS ======================
TIME EXPANSION/CONTRACTION FACTOR = 2
*
INTERPOLATION SAMPLES (samples between resynthesis frames) = 220
*
OSCILLATOR RESYNTHESIS THRESHOLD (in dB) = -96.000000
*
GAIN (in dB) = 0.000
PITCH TRANSPOSITION (in semitones) = 2.000
FREQUENCY SHIFT (in Hz) = 0.000
*
ENVELOPE ATTACK TIME (in seconds) = 0.000
ENVELOPE RELEASE TIME (in seconds) = 0.000
*
SPECTRUM WARPSHAPE INDEX = 0.000
*
FREQUENCY WINDOW: LOW BOUNDARY = 0.000000
FREQUENCY WINDOW: HIGH BOUNDARY = 22050.000000
*
*............. LOW/HIGH SHELF EQ............*
LOW SHELF FREQUENCY = 200.000
.......... LOW SHELF DECIBELS = 0.000
HIGH SHELF FREQUENCY = 2000.000
.......... HIGH SHELF DECIBELS = 0.000
*...........................................*
*
=====================================================================
ANALYSIS: CHANNEL = 1
..............USING BLACKMAN WINDOW
.....USING OSCILLATOR BANK RESYNTHESIS
*********************************************************************
** PEAK AMPLITUDE STATISTICS **
*********************************************************************
TIME PEAKAMP DECIBELS (LAST DECIBELS PEAK)
*********************************************************************
( 0.00 - 0.25) 0.0005 -66.295 -66.295
( 0.25 - 0.50) 0.2052 -13.754 -13.754
( 0.50 - 0.75) 0.3285 -9.668 -9.668
( 0.75 - 1.00) 0.3066 -10.269
( 1.00 - 1.25) 0.3176 -9.962
( 1.25 - 1.50) 0.2731 -11.275
( 1.50 - 1.75) 0.2655 -11.518
( 1.75 - 2.00) 0.2416 -12.337
( 2.00 - 2.25) 0.2930 -10.661
( 2.25 - 2.50) 0.2915 -10.707
( 2.50 - 2.75) 0.3067 -10.267
( 2.75 - 3.00) 0.4094 -7.757 -7.757
( 3.00 - 3.25) 0.3076 -10.241
( 3.25 - 3.50) 0.2841 -10.930
( 3.50 - 3.75) 0.2843 -10.924
( 3.75 - 4.00) 0.3241 -9.786
( 4.00 - 4.25) 0.3340 -9.524
( 4.25 - 4.50) 0.3612 -8.845
( 4.50 - 4.75) 0.3113 -10.136
( 4.75 - 5.00) 0.3094 -10.189
( 5.00 - 5.25) 0.3141 -10.058
( 5.25 - 5.50) 0.1142 -18.846
============= PEAK AMPLITUDE ========================================
CHANNEL TIME PEAKAMP DECIBELS (CLIPPED SAMPLES)
.....................................................................
1 2.898 0.4094 -7.757
*********************************************************************
=====================================================================
ANALYSIS: CHANNEL = 2
..............USING BLACKMAN WINDOW
*********************************************************************
** PEAK AMPLITUDE STATISTICS **
*********************************************************************
TIME PEAKAMP DECIBELS (LAST DECIBELS PEAK)
*********************************************************************
( 0.00 - 0.25) 0.0004 -67.948 -67.948
( 0.25 - 0.50) 0.2301 -12.763 -12.763
( 0.50 - 0.75) 0.2477 -12.122 -12.122
( 0.75 - 1.00) 0.1969 -14.115
( 1.00 - 1.25) 0.2631 -11.599 -11.599
( 1.25 - 1.50) 0.2086 -13.613
( 1.50 - 1.75) 0.2559 -11.840
( 1.75 - 2.00) 0.2671 -11.465 -11.465
( 2.00 - 2.25) 0.2768 -11.157 -11.157
( 2.25 - 2.50) 0.1762 -15.082
( 2.50 - 2.75) 0.2113 -13.502
( 2.75 - 3.00) 0.2549 -11.872
( 3.00 - 3.25) 0.2673 -11.460
( 3.25 - 3.50) 0.2869 -10.847 -10.847
( 3.50 - 3.75) 0.2841 -10.931
( 3.75 - 4.00) 0.1991 -14.019
( 4.00 - 4.25) 0.2131 -13.427
( 4.25 - 4.50) 0.2540 -11.904
( 4.50 - 4.75) 0.2235 -13.014
( 4.75 - 5.00) 0.2407 -12.369
( 5.00 - 5.25) 0.2941 -10.629 -10.629
( 5.25 - 5.50) 0.1166 -18.667
============= PEAK AMPLITUDE ========================================
CHANNEL TIME PEAKAMP DECIBELS (CLIPPED SAMPLES)
.....................................................................
2 5.103 0.2941 -10.629
*********************************************************************
=====================================================================
PEAK AMPLITUDES: ALL CHANNELS
---------------------------------------------------------------------
CHANNEL TIME PEAKAMP DECIBELS (CLIPPED SAMPLES)
.....................................................................
1 2.898 0.4094 -7.757
2 5.103 0.2941 -10.629
=====================================================================
PLAINPV: RESYNTHESIS COMPLETED
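Several of the numbers in this listing are related by simple arithmetic, which the Python sketch below reproduces as a reading aid. The input values are copied from the listing above. The relationships themselves are assumptions inferred from the printed numbers (for instance, that the decimation is the sample rate divided by the frames per second, truncated to an integer), not statements taken from the PVC source.

```python
import math

# Values copied from the sample output.
sample_rate = 44100          # INPUT FILE: SAMPLE RATE
fft_size = 1024              # FFT SIZE
frames_per_second = 400      # FRAMES/SECOND
time_factor = 2              # TIME EXPANSION/CONTRACTION FACTOR
input_duration = 2.770386    # INPUT FILE: DURATION

# Assumed relationships, chosen because they match the listing.
fundamental = sample_rate / fft_size                    # 43.06640625
decimation = int(sample_rate / frames_per_second)       # 110
interpolation = decimation * time_factor                # 220
output_duration = input_duration * time_factor


def amp_to_db(amp):
    """Convert a linear amplitude (1.0 = full scale) to decibels."""
    return 20 * math.log10(amp)


print(fundamental, decimation, interpolation, output_duration)
print(amp_to_db(0.4094))  # channel 1 peak amplitude from the listing
```

The last line shows how the PEAKAMP and DECIBELS columns correspond: an amplitude of 0.4094 is about -7.76 dB relative to full scale.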