ECMC PVC Documentation



Paul Koonce's INTRODUCTION

PVC is a collection of phase vocoder signal processing routines and accompanying shell scripts for use in the transformation and manipulation of sounds. It is written in C and designed to be used in a UNIX environment. It has come about as a result of my path of education and research into phase vocoder technology. It follows in the spirit of the work by Eric Lyon (out of which PVC is built) and Chris Penrose whose particular DSP research springs from the coding and tutorial work of F.R. Moore and Mark Dolson. Moore's book, Elements of Computer Music, published by Prentice Hall, is therefore a great resource for making sense of the phase vocoder engine which I am unable to go into here. Curtis Roads's book, The Computer Music Tutorial, published by MIT Press, has sections on the phase vocoder as well; these may better introduce the beginner to the practical concerns of this technology. Short of the explanations these sources provide, I have attempted to offer below some explanations, particularly as needed for control of the parameters in these routines. A manual and tutorial would be great to have; unfortunately time has not yet made it so.

These routines reflect my need for tools which can perform different spectral resynthesis tasks, both simple and experimental. Their refinement has advanced with my growing skills and curiosity, which I expect will continue as long as I have questions about sound. Most of these routines can be viewed in terms of traditional additive or subtractive synthesis tasks, coming about as they did from the desire for greater finesse and control of these two basic types of synthesis. While the speculative nature of some gives them an idiosyncratic character, most should, with practice, reveal the transparency of their names if not the role they can play in the shaping of sound. All require a good ear tuned towards sound and idea as none of these routines are automatic, although many hold great potential for the diligent.

This 3.0 release contains only those routines which I think are stable, useful and moderately transparent. Some earlier versions have been omitted, replaced or consolidated into newer routines. For example, compander remains, but the ideas behind bandamp have ripened into spectwarper, a remarkable "super companding" tool for windowing amplitude and balancing the resonance/noise-residues of a sound. The harmonic tone reorganizer, chordmapper, has continued to grow in its controls (however arcane), offering increasingly subtle ways to reorganize harmonic spectra. The noisefilter routine is now very good, having become a PVC first encounter routine for many whose noisy lives cross my path. Tvfiltdeviator now joins the arcane but novel filtdeviator routine. In addition, I have added a set of feature analysis routines (pitchtracker, centroid, envelope, fluxoid), which should be useful in generating function files to control different synthesis strategies. There are other, more experimental routines (some actually appeared in 2.0) which are still proving themselves; in time they will appear or reappear. As with 2.0, floating-point files (combined with a rescale feature) continue to be readable and writable. Someday I will deal with AIFF headers (although they do not offer floating-point values), but not for now.

Paul Koonce
koonce@music.princeton.edu



*** ECMC ANNOTATIONS ***

This version of the PVC html documentation has been edited from Paul Koonce's original PVC 3.0 manual by Allan Schindler to reflect usage of the PVC programs at the Eastman Computer Music Center. Some portions of Koonce's original documentation have been omitted, other portions shortened, edited or reworded, and I have placed some passages, like the paragraph above, in a smaller font. These small font passages can be skipped by ECMC users who are just beginning to use the PVC programs, but may be of interest to more advanced users. Paul Koonce's complete original version 3.0 HTML documentation on the PVC programs is available at http://www.ecmc.rochester.edu/ecmc/docs/pvc/koonce_doc.html and at http://www.cs.princeton.edu/courses/archive/spr99/cs325/koonce.html

Within the ECMC version you are reading, information specific to using these programs at Eastman, like this passage, is printed in green font in the online version. To make these ECMC annotations easier to spot in the printed grayscale version of this document, the section headers of these ECMC-specific annotations are enclosed in asterisks, as in the *** ECMC ANNOTATIONS *** header above.

At the ECMC, we are running Koonce's PVCX on wozzeck, the Mac Pro in the MIDI studio, and a port of Koonce's PVC 3.0 programs written by John Gibson on our Linux systems.

The PVC package includes about 20 separate programs, or "routines," briefly described in the ROUTINES: SHORT DESCRIPTIONS section of this document, as well as several ancillary function generating and utility programs. The simplest of these programs -- the closest thing in the PVC package to a "basic" phase vocoder application -- is the plainpv "routine," with which you should begin your work with PVC. However, even plainpv is considerably more powerful than most "basic" phase vocoder applications, providing many analysis and resynthesis options. These options offer two advantages:
  • they provide considerable sound modification possibilities, well beyond the basic pitch shifting and time warping features typically available in basic phase vocoder applications; and
  • they provide many hooks -- parameters (variables) that can be changed if your initial attempts are not satisfactory.
The cost, or downside, of these many options can be complexity in usage -- finding your way to the particular parameter you wish to change, or in certain cases, many parameter decisions (some of which you may not understand very well) that must be made.

However, while an initial glance at the PVC scripts may seem daunting, things often are not so bad after all. All of the parameter options are provided with default values. In general, if we accept all of the defaults, the result will be straight resynthesis, and if all goes well the output soundfile should sound identical to the original. In your initial attempts at analysis/resynthesis, you can skip over most of the parameter options, relying upon the default values, and change only those parameters necessary to achieve the desired musical result. Often, the result will be fine. But if not, you then can adjust other parameters in attempting to achieve a more satisfactory result or to eliminate artifacts.
(*** End of this ECMC note ***)



ORIGINAL COMMAND LINE FORMAT

All of the executable PVC programs, like plainpv and twarp, originally were designed to be run from a shell window with the standard Unix command line syntax:

routine [flags] input_soundfile output_soundfile
like this:

plainpv -N2048 -P3 input.snd output.snd

Here two flag options are included: a -N (FFT size) value of 2048, and a -P argument of 3 to transpose the pitch up three semitones. Flag options not included on a command line are initialized to default values.
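
The shape of this command line, and the meaning of a -P value given in semitones, can be sketched in a few lines of Python. This is only an illustration: the helper names build_command and semitones_to_ratio are hypothetical, and the ratio calculation assumes equal-tempered semitones, which this documentation does not state explicitly.

```python
def build_command(routine, flags, infile, outfile):
    """Assemble an argument list in the form: routine [flags] input output.

    `flags` maps a single-letter flag to its value. Flags that are left
    out are initialized to default values by the PVC routine itself.
    """
    args = [routine]
    for letter, value in flags.items():
        args.append(f"-{letter}{value}")
    args.extend([infile, outfile])
    return args


def semitones_to_ratio(semitones):
    """Frequency ratio for a transposition, ASSUMING equal temperament."""
    return 2.0 ** (semitones / 12.0)


cmd = build_command("plainpv", {"N": 2048, "P": 3}, "input.snd", "output.snd")
print(" ".join(cmd))          # the command line shown above
print(semitones_to_ratio(3))  # roughly 1.19 for three semitones up
```

Under that assumption, -P12 would double every frequency and -P-12 would halve it.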



INFORMATION PAGE

Information about any routine can be seen by typing the name of the routine without any arguments or file name. Typing:

plainpv

produces the following information about plainpv.

plainpv:  generic phase vocoder with dynamic controls  
plainpv   [flags] [input file (16-bit shorts)] [output file (optional)]
           (values in brackets denote defaults)
       N:      FFT length (must be a power of 2) [1024]
       M:      window size in samples (must be a power of 2) [2*FFT]
                   (0 will automatically set window to 2*FFT size or larger)
       w:      window type: 0 = hamming,  1 = rectangular  
                   2 = Blackman,  3 = Bartlett triangular [0.]
                   4-12 = Kaiser windows for alpha = 4-12,  respectively
                   (representative sidelobe levels for alpha: 
                     4 = -30dB,  8 = -58 dB,  12 = -90 dB)
       D:      analysis frames per second [200]
       I:      time expansion/contraction factor  [1.] 
                 (duration = duration * factor, 1. = original time) 
       P:      pitch transposition in semitones (func) [0]
       a:      frequency shift factor 
                   (bin frequency adder, before -P )(func) [0.] 
       b:      begin time in seconds  [0.] 
       e:      end time in seconds ( 0. = end of file) [0.] 
       C:      resynthesis channel (1 -> ?) (0 = all) [0] 
            SHELF EQ:(post transpose/shift)
       H:      SHELF EQ: Low shelf gain in dB (func) [0.] 
       X:      SHELF EQ: High shelf gain in dB (func) [0.] 
       m:      SHELF EQ: Low shelf frequency in Hz (func) [200.] 
       R:      SHELF EQ: High shelf frequency in Hz (func) [2000.] 
       W:      warp index for reshaping magnitude response (func) [0.] 
                   Values > 0 expand the dynamic range, 
                   values < 0 compress the dynamic range. 
       A:      gain in decibels (func) [0.] 
       l:      envelope attack time  (func) [0.]
       L:      envelope release time   (func) [0.]
       T:      BRICKWALL FILTER TYPE: 0 = bandpass, not 0 = band reject [0]
       f:      frequency window: low boundary  
                   (before -P and -a) (in Hz) [0.] 
       F:      frequency window: high boundary 
                   (before -P and -a)(in Hz) [Nyquist frequency] 
       p:      amplitude reports print mode: 0 = off, 1 = on [0]
       i:      time interval between amplitude reports [.25]
       _:       OUTPUT FORMAT: 0 = taken from input file
                   1 = 16-bit integer, 2 = 32-bit floats [0]
       =:       PEAK RESCALE LEVEL (float output only) 0 to -96 dB 
                   Set to 1 to rescale to level of input file. [ 1 ]
               TERMINAL DISPLAY AND GRAPH FILE OUTPUT
       n:          number of frames  [0]
       u:          low bin frequency  [-1]
       U:          high bin frequency  
                   (-1 = nyquist) [Nyquist frequency]
       S:      TERMINAL DISPLAY: display option  [0]
                 (0 = off,  1 = phase data,  2 = amp data, 3 = both)
       c:      GRAPH FILE: WRITE ascii to FILE
                   0 = off,  1 = freq,  2 = decibels [0]
                   3 = decibels - waterfall plot
                   (When on,  this flag writes ascii point pairs
                    (with time frame on x axis) for plotting 
                     with gnuplot.)
       d:      TERMINAL DISPLAY FILE NAME for -c [./ascii.out]
       t:      oscillator resynthesis threshold in decibels [ -96 ] 
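
Several of the flags listed above imply simple arithmetic that is worth having at hand when choosing values. The Python sketch below is not part of PVC, and its function names are hypothetical. It assumes that the -I factor is applied to the segment selected with -b and -e, which the information page does not spell out.

```python
def db_to_amplitude(db):
    """Linear amplitude factor for a gain given in decibels (as with -A)."""
    return 10.0 ** (db / 20.0)


def nyquist_frequency(sample_rate):
    """Default upper boundary for -F and -U: half the sampling rate."""
    return sample_rate / 2.0


def scaled_duration(begin, end, file_duration, factor):
    """Output length in seconds for a -b/-e segment scaled by -I.

    An end time of 0 stands for "end of file", as on the information page.
    ASSUMPTION: the factor is applied to the selected segment only.
    """
    if end == 0.0:
        end = file_duration
    return (end - begin) * factor


print(db_to_amplitude(-6.0))                # a little more than 0.5
print(scaled_duration(0.0, 0.0, 4.0, 2.0))  # 8.0
```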



CONTROLLING PARAMETERS WITH FUNCTIONS

Parameters which have the word (func) on the info page just before the default as in:

W: warp index for reshaping magnitude response (func) [0.]
can be controlled dynamically. This is done by providing the full pathname of a file in place of the constant. The file is assumed to be a headerless series of values representing how the parameter will evolve as a function of time. The values may be either 32-bit floating-point values or ASCII numbers arranged one per line (the routine deciphers which it is). The function file can have any number of values, as the series is fitted to the specified duration and linearly interpolated to produce the values in between. Function files in 32-bit floating-point form can be created with the CMUSIC gen routines provided with this package.
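
As a rough illustration of the two function file layouts and of how a short series is stretched over a duration, here is a minimal Python sketch. It is not how PVC itself reads these files: the function names are hypothetical, and native byte order for the 32-bit floats is an assumption, since the text above does not specify it.

```python
import struct


def ascii_function(values):
    """Text for an ASCII function file: one number per line, no header."""
    return "".join(f"{v}\n" for v in values)


def float_function(values):
    """Bytes for a headerless 32-bit float function file.

    ASSUMPTION: native byte order ("=" keeps the standard 4-byte size).
    """
    return struct.pack(f"={len(values)}f", *values)


def value_at(values, t, duration):
    """Linearly interpolate the series, fitted to `duration`, at time t."""
    if len(values) == 1 or duration <= 0:
        return float(values[0])
    pos = min(max(t / duration, 0.0), 1.0) * (len(values) - 1)
    i = int(pos)
    if i >= len(values) - 1:
        return float(values[-1])
    frac = pos - i
    return values[i] + frac * (values[i + 1] - values[i])


# Three values spread over four seconds: 0 at the start, 12 at the
# midpoint, 0 at the end. One second in, the value is halfway to 12.
print(value_at([0.0, 12.0, 0.0], 1.0, 4.0))  # 6.0
```

To try such a series with a (func) parameter, write the returned text or bytes to a file and give the full pathname of that file in place of the constant.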

*** ECMC note: Information on using the function generating routines is provided within the GEN FUNCTION CONTROL OF PARAMETERS section of this document. (*** End of this ECMC note ***)



*** USING THE ECMC SCRIPTS ***

As you can readily see from the INFORMATION PAGE output of plainpv above, because these programs have so many options, it is impractical to run them from a command line. Instead, Koonce provides Bourne shell script interface files, and Gibson provides Perl script files, that can be used to run the actual binary programs. Not surprisingly, neither Koonce's nor Gibson's scripts are ideally suited to the way we have set up the ECMC SGI and Linux systems.

Therefore, to simplify usage of many (but by no means all) of the PVC programs by ECMC users, I have created local ECMC scripts, with the program name followed by the extension .tp ("template"), based upon the shell script models provided by Koonce, that can be used to run these routines. To obtain a list of currently available ECMC templates for PVC programs, type

pvc.tp

To obtain a usage summary on using one of these template scripts, type the script name with no arguments.
Example: Typing
plainpv.tp

will display the following usage summary:

plainpv.tp syntax: plainpv.tp insound [outsound] [> scriptfile]
where "insound" is the name (and, if necessary, path) of the input
soundfile and the optional "outsound" argument is the name of the
output resynthesis soundfile. If the "outsound" argument is omitted
the output soundfile will be named "test."
After capturing this template in an ascii file and editing this
file, run plainpv with this script file with the command:
sh scriptfile
To see a "plainpv" template file without providing soundfile arguments type
plainpv.tp -

As this usage summary explains, if we simply wish to see what a plainpv.tp script file looks like, without providing any arguments, we can type

plainpv.tp -
in a shell window. However, the display will go whizzing by, so we probably will want to pipe the output through a paging program such as less or more in order to display this output one "page" (screenful) at a time:
plainpv.tp - | less

To obtain a script file to run plainpv using the soundfile /sflib/wind/fl.c4 as our input sound and to write the resynthesized output to a soundfile called pvcflutetest1 in our current working soundfile directory, we would type

plainpv.tp /sflib/wind/fl.c4 pvcflutetest1 > scriptfile
where scriptfile is the name we wish to give to this script file. (Most likely, the name we give to this ASCII file would be pvcflutetest1, the same name as the soundfile it creates.)

To obtain a script to run pvanalysis use either the command pvanalysis.tp or else the alias pvcanal.tp. To obtain a script to run twarp, use the command twarp.tp, and so on.

In addition to these .tp script templates, I have created example script files for many (but, again, not for all) of the PVC programs. To obtain a listing of these example files, type

pvcex (or else the alias getpvcex)

To display one or more of these example PVC script files type:
        pvcex   filename(s)   or else    getpvcex   filename(s) 
To display one or more of these example PVC script files through the paging program "less," type:
   pvcex   filename(s) |  less   or else   getpvcex   filename(s) |  less
To capture one or more of these files, type:
  pvcex  filename(s)  >  outfile   or else   getpvcex  filename(s)  >  outfile
where outfile is the name you want to give to this file.
Hardcopy printouts of all of these examples also are available in the ECMC PVC EXAMPLE FILES binder in all three ECMC studios.

Soundfiles in the sflib/x directory exist for all of these examples except for a few that do not create soundfiles, but rather analysis files or some other type of file.

--- --- --- --- --- --- --- ---

LEARNING HOW TO USE THE PVC PROGRAMS AND RUNNING THEM AT EASTMAN

To learn how to use plainpv, the most basic program in the package, or any other PVC program for which an ECMC .tp script template exists, I recommend the following steps:

  1. Read the summary description of the program within the ROUTINES: SHORT DESCRIPTIONS section later in this document.
  2. Find out what ECMC example files exist for the program by typing
    pvcex
    and noting the names of example files for the program. In the case of plainpv, there are several examples, and we probably would begin with example plainpv1
  3. Look at one of the example files. To see example plainpv1, for example, type
    pvcex plainpv1

    Study this example and listen to the compiled soundfile in the sflib/x directory that was created by the example:
    psfl plainpv1

  4. Look at, and listen to, other examples created by the program, such as plainpv2 and plainpv3
  5. When you are ready to use the program yourself, obtain a template file for the program. For a usage summary of how to use the ECMC .tp script, type the script name with no arguments:
    plainpv.tp

    Then type
    plainpv.tp insound outsound

    and, if everything looks okay,
    !! > filename

    Alternatively, you can capture the template in a file immediately by typing
    plainpv.tp insound outsound > filename

    This will create a script file with analysis and resynthesis parameters that you can edit and then use to run plainpv.
  6. Next open this script file with a text editor. Each of these script files consists of a TOP half, which you can edit, and a BOTTOM half, in which you should make no changes (except possibly at the very end, as discussed below) or else the script may not run correctly. Within the top section of the file, change some of the default parameter values to meet your resynthesis goals. Do not change anything within the bottom half of the file (after the "OFFICE USE ONLY" line) except, at the very end, to remove any temporary function files you have created. (Beginning users need not worry about this.) The pound sign # serves as the comment symbol for all PVC scripts, and all characters on a line that follow this symbol are ignored by the PVC programs.
  7. When your script file is ready, run the program with the command
    sh filename

    (Note that PVC scripts must be run by a Bourne shell, with the sh command. The Bourne shell, the oldest type of Unix shell, differs in some ways from the C shell (csh) and tcsh, with which most ECMC users are more familiar.)
  8. When the job is completed, play the resulting soundfile. If you aren't completely happy with the result, edit the script file again and run it again.
--- --- --- --- --- --- --- ---

An ECMC help file called pvc summarizes the usage information above and can be consulted for quick reference when using these programs.
(*** End of this ECMC note ***)



INPUT and OUTPUT SOUND FILES

*** ECMC Notes: ***
All internal processing in both the SGI and Linux versions of the PVC programs is done in floats. However, there is an important difference between the SGI and Linux distributions of PVC:

  • SGI : In Koonce's SGIPVC distribution, all soundfiles (both input and output) must be in NeXT/Sun 16-bit format, rather than in the AIFF or AIFC formats normally employed on SGI systems. This may seem goofy, and in certain ways it is. The only advantage offered by NeXT format is the option of writing the output samples either as 16 bit integers (the norm) or else, for greater resolution, as 32 bit floats, which then must be rescaled to integers before they can be played. In most cases, however, there will be little audible difference, and ECMC users should use the default 16 bit integer format unless you know what you are doing and don't mind some additional work and complication.

    When using the ECMC .tp scripts on the SGI systems, users are relieved of the chore of converting input soundfiles from AIFF to NeXT format, then converting the NeXT format output soundfiles to AIFF format. The ECMC scripts handle these conversions automatically. Input soundfiles should be in AIFF format. For each PVC job you run, the ECMC script makes a temporary NeXT format copy of the input soundfile called pvcin, which is used by the PVC program. The PVC program writes its output to a temporary NeXT format soundfile called pvcout. When the job is completed, the script creates an AIFF format copy of pvcout, and then deletes this temporary NeXT format soundfile. Note that because ALL SGI system PVC jobs create an output soundfile called pvcout, you can only have one PVC job at a time running on the SGIs. Since these programs are processor and disk intensive, it actually would make little sense to try to run two or three PVC jobs simultaneously anyway.

  • LINUX : Gibson's Linux port provides compiling options for outputting either WAVE or AIFF format soundfiles. Soundfiles in either AIFF or WAVE formats can be used as inputs. The ECMC scripts provide a parameter in which you select either AIFF or WAVE output:
    outputformat=AIFF  # for Linux only : specify AIFF or WAVE output format
    Note that the default is AIFF output. However, by changing
    outputformat=AIFF
    to
    outputformat=WAVE
    you can obtain a WAVE format output soundfile, even if the format of the input soundfile is AIFF.
(*** End of this ECMC note ***)



PLAYBACK DURING PROCESSING


*** ECMC Notes: ***

Phase vocoder jobs sometimes can take a long time to run. The PVC programs do update the output soundfile headers frequently, so that partially completed output soundfiles can be played before the job has completed. This must be done carefully, however. First you must suspend the PVC job (so that it does not continue to append samples to the soundfile while you are trying to play the soundfile) by typing ^z (control z). Then to play the partially completed output soundfile:

  • SGI systems: Remember that on the SGI systems partially compiled output soundfiles are in NeXT rather than AIFF format, and have the temporary name pvcout. Type
    play pvcout

    to play this partially completed soundfile.
  • LINUX systems: Partially compiled soundfiles will be in WAVE format, but will have the name you have provided, rather than pvcout.
After playing the partially completed soundfile one or more times, resume compilation by typing % or fg. To kill the job, type ^c (control-c) after resuming compilation. IMPORTANT: Even if the output is unbearably ugly, do not forget to resume compilation and then kill the job. If you do not resume compilation, the job will remain loaded in RAM.

(*** End of this ECMC note ***)


ROUTINES: SHORT DESCRIPTIONS

*** ECMC Notes: ***

Below is a listing of the routines contained in this release along with a description of what each does. These programs are divided here into two groups:

  1. First, those for which an ECMC script exists, and which therefore are ready to use on the ECMC systems.
  2. Next, those for which there currently is no ECMC script. These programs cannot be used as easily on the ECMC systems, and are recommended only for advanced users who don't mind doing some additional work to use these programs. These descriptions are printed here in small font, and may be skipped over by almost all ECMC users.
    (*** End of this ECMC note ***)

    *** (1) PVC Programs for which an ECMC .tp script exists: ***

BASIC ROUTINES

PLAINPV

Plainpv is a basic phase vocoder with control of pitch transposition, frequency shift, time scale, amplitude warp and low/high shelf equalization. It also has some nice controls for looking at the data produced by the phase vocoder.

*** ECMC Notes: *** At Eastman, obtain a shell script template for this routine with plainpv.tp; edit this template, and then run the script with the command: sh scriptfilename
Using the pvcex or getpvcex command, see the following example files :

plainpv1,   plainpv2,   plainpv3 (a mix of examples plainpv3-1 and plainpv3-2),
plainpv4,  plainpv5, and plainpv6 (a mix of examples plainpv6-1 and plainpv6-2).
Example plainpv7 imposes the amplitude envelope of a maraca roll on a gong tone. The maraca roll envelope was created by ECMC PVC example envelope1. Example plainpv8 incorporates a pitch analysis file created by ECMC PVC example pitchtracker1.

All example files listed here and below are available in the hardcopy ECMC PVC EXAMPLE FILES binder in the studios.


TWARP

Twarp is like plainpv except that it works from an analysis file rather than a soundfile. This allows you to move forwards/backwards through time according to a time function file.
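
Moving forwards and backwards through an analysis file can be pictured as a lookup from time positions to analysis frames. The Python sketch below is purely illustrative. It assumes that a time function holds source positions in seconds and that the analysis was made at a known number of frames per second (the -D value on the plainpv information page); neither assumption is confirmed by this documentation, and frame_indices is a hypothetical name.

```python
def frame_indices(time_points, frames_per_second, total_frames):
    """Map source-time positions (in seconds) to analysis frame indices.

    ASSUMPTION: each time point names a position in the analyzed sound.
    Positions outside the analysis are clamped to its first or last frame.
    """
    indices = []
    for t in time_points:
        idx = int(round(t * frames_per_second))
        indices.append(min(max(idx, 0), total_frames - 1))
    return indices


# Forwards to 0.5 seconds, back to 0.25 seconds, then past the end of a
# two-second analysis made at 200 frames per second.
print(frame_indices([0.0, 0.5, 0.25, 10.0], 200, 400))
```

Because the lookup is free to decrease as well as increase, a time function that falls reads the analysis backwards.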

*** ECMC Notes: *** To use twarp:

  1. You must first run pvanalysis, employing the ECMC script pvanalysis.tp (pvcanal.tp) in order to create an analysis file of some soundfile.
  2. Then, use twarp.tp to get an ECMC template to run twarp.
  3. Edit the file you obtained with twarp.tp, and then
  4. Run twarp with the command sh filename.
The following ECMC example files illustrate various aspects and possibilities of this program:
twarp1,  twarp2 (a mix of example files twarp2-1 and twarp2-2),
twarp3, twarp4 (a mix of example files twarp4-1 and twarp4-2) and twarp5 

Example twarp6 illustrates time point dithering, and is a mix of ECMC examples twarp6-1, twarp6-2, twarp6-3, twarp6-4 and twarp6-5.


ANALYSIS ONLY

PVANALYSIS

Pvanalysis is the time varying form of freqresponse that creates a phase vocoder analysis for use by other routines. The routines which require pvanalysis files are twarp, convolver, tvfilter, ringtvfilter, and tvfiltdeviator.

*** ECMC Note: *** At Eastman, use pvanalysis.tp (or else the easier-to-type alias pvcanal.tp) to obtain a template file to run this routine. Edit this file, then type sh filename to create the analysis file. See ECMC example files pvanalysis.voicetest and pvcanal2.

FREQRESPONSE

Freqresponse is a routine used by several others to prepare a spectrum for use with routines that filter, compress or limit. The response can be normalized or not depending on the needs of the routine which will use the response.

*** ECMC Note: *** At Eastman, use freqresponse.tp to obtain a template for freqresponse. After editing this template file, run the analysis by typing: sh filename
See ECMC example files freqresponse1 and freqresponse2.


AMPLITUDE WARPING

NOISEFILTER

Noisefilter filters out the noise in a sound by subtracting out a frequency response. The frequency response is analyzed from a short segment in the file where noise alone is found. For sounds that do not have segments of isolated noise, there is a threshold mode.
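
The idea of subtracting a frequency response can be sketched on bare lists of bin magnitudes. This Python fragment is a simplified illustration of spectral subtraction in general, not noisefilter's actual code; both function names are hypothetical, and taking the noise response as an average over the noise-only frames is an assumption.

```python
def average_response(frames):
    """Average the bin magnitudes over several noise-only frames."""
    count = len(frames)
    return [sum(column) / count for column in zip(*frames)]


def subtract_noise(frame_mags, noise_mags, floor=0.0):
    """Subtract the noise response bin by bin, never going below `floor`."""
    return [max(m - n, floor) for m, n in zip(frame_mags, noise_mags)]


noise = average_response([[1.0, 2.0], [3.0, 4.0]])  # [2.0, 3.0]
print(subtract_noise([5.0, 1.0], noise))            # [3.0, 0.0]
```

A bin that is weaker than the noise estimate is simply silenced here, while a stronger bin loses only the estimated noise portion of its magnitude.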

*** ECMC Note: *** At Eastman, run this routine with noisefilter.tp, but good luck. After an hour or so of testing and fussing with this program, I was unable to come up with any musical results worth listening to or turning into an example file.

COMPANDER

Compander is a classic compressor/expander. What is different here is the use of a peaks response file. The peaks response file is a frequency response, analyzed from a segment of the sound, that is taken to represent the peak bin amplitudes for the sound. Each frequency bin of the peaks frequency response functions as the 0 dB reference point for that frequency bin. The amplitude of the frequency bin is companded relative to this reference.
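
To make the per-bin 0 dB reference concrete, here is a minimal Python sketch of compressing one bin relative to its own peak. The threshold and ratio arguments are hypothetical illustrations of a conventional compressor curve; they are not compander's flags, and the curve compander actually applies may differ.

```python
import math


def compand_bin(amp, peak, threshold_db, ratio):
    """Compress one bin's level above threshold_db, measured against its peak.

    The bin's entry in the peaks response is its 0 dB reference, so a bin
    at its peak amplitude has a level of 0 dB.
    """
    if amp <= 0.0 or peak <= 0.0:
        return 0.0
    level_db = 20.0 * math.log10(amp / peak)
    if level_db > threshold_db:
        # Above the threshold, the level rises 1 dB for every `ratio` dB.
        level_db = threshold_db + (level_db - threshold_db) / ratio
    return peak * 10.0 ** (level_db / 20.0)


# A bin at its peak (0 dB), threshold -20 dB, ratio 2: the result is -10 dB.
print(compand_bin(1.0, 1.0, -20.0, 2.0))
```

Because every bin is measured against its own peak, a weak high partial and a strong fundamental can both sit at 0 dB and be treated alike.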

*** ECMC Note: *** At the ECMC, the entire analysis/companding process can be run with a script file provided by compander.tp. However, currently there are no ECMC example files for this program, which is complicated to use.


SPECTWARPER

Spectwarper uses an expanded compansion scheme to highlight either a sound's stronger, resonant components or its weaker noise/residual components. Spectwarper is fairly similar to compander; however, unlike compander which compands bins against the constant peak of an input response file, spectwarper compands bins using a peak drawn (in the current frame) from a narrow frequency band centered around the value being processed. This causes the compansion or "warping" of the amplitudes to accentuate (expansion) or mask (compression) formants located within the frequency bands; the result being the noise/pitch highlighting mentioned earlier. Part of this comes from the treatment of compression in Spectwarper. Unlike compander which only reduces the amplitude above the threshold when compressing, spectwarper reduces the amplitude of the entire range, becoming, in effect, an expander of the strongest amplitudes that expands them (when the compression level is severe) out of the picture. Spectwarper is one of my favorite routines of late simply because it provides such a simple and powerful control over the noise and pitch characteristics of a sound.
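
The difference between a constant peak and a local, per-frame peak can be sketched as follows. This Python fragment is only a guess at the general shape of the idea: the function names are hypothetical, the band is measured in bins rather than Hz, and a simple power law stands in for spectwarper's real compansion curve, which is not given here.

```python
def local_peaks(mags, half_width):
    """Strongest magnitude within half_width bins on either side of each bin."""
    peaks = []
    for i in range(len(mags)):
        lo = max(0, i - half_width)
        hi = min(len(mags), i + half_width + 1)
        peaks.append(max(mags[lo:hi]))
    return peaks


def warp_frame(mags, half_width, exponent):
    """Reshape each bin relative to its local peak (ASSUMED power law).

    An exponent above 1 pushes weaker bins further below the local peak;
    an exponent below 1 pulls them up towards it.
    """
    warped = []
    for m, p in zip(mags, local_peaks(mags, half_width)):
        warped.append(0.0 if p <= 0.0 else p * (m / p) ** exponent)
    return warped


frame = [1.0, 4.0, 2.0, 0.5]
print(local_peaks(frame, 1))      # [4.0, 4.0, 4.0, 2.0]
print(warp_frame(frame, 1, 2.0))  # [0.25, 4.0, 1.0, 0.125]
```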

*** ECMC Note: *** ECMC users can obtain a script file to run spectwarper with spectwarper.tp.
Currently, there are no ECMC example files for spectwarper, and this puppy, though powerful, is not so easy to use.


BANDAMP

Bandamp is an older PVC program, no longer included in the current PVC distribution. (Its capabilities also can be realized with the newer spectwarper program.) However, I still find bandamp useful, and it is still available on the ECMC SGI machines. THERE IS NO LINUX VERSION.

This program is an amplitude windowing routine. Like compander, it uses a response file, previously created with freqresponse, to orient where 0 dB lies for each frequency. Using this reference it gives you a window of amplitudes. While bandamp can be used to select only the stronger amplitudes to produce a result similar to noise filtering or expansion, its real use is for zeroing in on the weaker amplitudes by attenuating or eliminating the stronger partial frequencies. Setting a window range of -20 to -96 will do this. Wispy violin notes windowed this way will be reduced to their noise in a kind of unvoiced mode. Bandamp is difficult to make sound good, but effective when it does.
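
An amplitude window of this kind can be sketched in Python as a per-bin test against the response file's reference levels. This is a simplified illustration, not bandamp's code: window_bins is a hypothetical name, and bins outside the window are removed outright here, whereas bandamp also can attenuate them.

```python
import math


def window_bins(mags, reference, high_db, low_db):
    """Keep only bins whose level, in dB relative to the reference,
    lies between low_db and high_db; all other bins are set to zero."""
    kept = []
    for m, ref in zip(mags, reference):
        if m <= 0.0 or ref <= 0.0:
            kept.append(0.0)
            continue
        level_db = 20.0 * math.log10(m / ref)
        kept.append(m if low_db <= level_db <= high_db else 0.0)
    return kept


# With a window of -20 to -96 dB, the two strong bins (0 dB and about
# -6 dB) are removed and the weak bin (about -26 dB) passes through.
print(window_bins([1.0, 0.05, 0.5], [1.0, 1.0, 1.0], -20.0, -96.0))
```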

To use bandamp on the ECMC SGI systems:

  1. Use freqresponse to create a "frequency response" file that is used by bandamp along with an input soundfile
  2. Use bandamp.tp to obtain a script file to run bandamp.
  3. Edit this script file, then type: sh scriptfile to run the job
See ECMC PVC example file bandamp1 (a mix of example files bandamp1-1 and bandamp1-2) and example file bandamp2.

ADDITIVE SYNTHESIS -- HARMONIZER, CHORDMAPPER, AND INHARMONATOR:

These routines all allow for a kind of additive synthesis based on the remapping of phase vocoder data according to some model. Each requires an ascii data file specifying how phase vocoder information will be replicated or mapped. This mapping is constant for the run of the routine.

HARMONIZER

Harmonizer works much like a commercial harmonizer in that it allows you to create harmony against the source by adding a transposed copy of it. Here the concept is extended by allowing for multiple harmonizations, each taken from a different band of frequencies, output with separate gain.
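
The notion of several harmonizations, each drawn from its own frequency band and given its own gain, can be sketched on a list of partials. In this Python illustration the tuple layouts are hypothetical and are not the format of harmonizer's ascii data file; the transposition again assumes equal-tempered semitones.

```python
def harmonize(partials, voices):
    """Add transposed copies of the partials that fall inside each band.

    partials: list of (frequency_hz, amplitude) pairs from the source.
    voices:   list of (low_hz, high_hz, semitones, gain) tuples
              (a HYPOTHETICAL layout chosen for this sketch).
    """
    result = list(partials)  # the source itself is kept
    for low_hz, high_hz, semitones, gain in voices:
        ratio = 2.0 ** (semitones / 12.0)
        for freq, amp in partials:
            if low_hz <= freq <= high_hz:
                result.append((freq * ratio, amp * gain))
    return result


source = [(100.0, 1.0), (1000.0, 0.5)]
# One voice: everything below 500 Hz, an octave up, at half amplitude.
print(harmonize(source, [(0.0, 500.0, 12, 0.5)]))
```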

*** ECMC Note: *** At Eastman, run this program with a script initially obtained with the command harmonizer.tp
See also example file harmonizer1 (a mix of examples harmonizer1-1 and harmonizer1-2) and example file harmonizer2 (a mix of examples harmonizer2-1 and harmonizer2-2).

CHORDMAPPER

Chordmapper lets you specify how harmonically related groups of partials will be replicated or mapped to produce chords. An input data file organizes the remapping into tone groups, and includes ways to tune or neutralize the frequency deviations of partials. Time-varying control of these features is available as well. You can use this routine to build up thick chords from single tones, or to delicately reorganize a harmonic spectrum.

*** ECMC Note: *** At Eastman, run this routine with a script file obtained with the chordmapper.tp command. See example files
chordmapper1 (a mix of four source soundfiles created by examples chordmapper1-1, chordmapper1-2, chordmapper1-3 and chordmapper1-4),
example chordmapper2, and
chordmapper3, and its four source files: chordmapper3-1, chordmapper3-2, chordmapper3-3 and chordmapper3-4. chordmapper3 and its four sources are very similar to inharmonator1 and its four sources, using slightly different procedures to obtain almost identical results.

An ECMC help file on chordmapper also is available.

INHARMONATOR

Inharmonator lets you specify how the partials of one fundamental will be remapped or deviated. While the more recent and developed routine chordmapper is probably better for this task, I have decided to leave this routine in for now. (Think chordmapper.)

*** ECMC Note: ***Well, okay, Paul Koonce doesn't seem to have much affection for Inharmonator, but I have found this program useful.

NOTE: AS OF THIS WRITING INHARMONATOR IS BROKEN ON THE ECMC LINUX SYSTEMS AND IS ONLY AVAILABLE ON THE SGI SYSTEMS.
However, many (though not all) of the sound modification procedures provided by inharmonator also can be obtained by using chordmapper.

This routine can be used to alter the ratios between the partial frequencies ("detuning" the sound), and also the amplitude relationships of these partials. Because this can be a powerful program, but complex to use, I have provided an ECMC help file called inharmonator on its usage.

At Eastman, run this program with a script file obtained with the command inharmonator.tp. See the example files
inharm1 (a mix of example files inharm1-1 -- inharm1-2 -- inharm1-3 and inharm1-4) and
inharm2 (a mix of example files inharm2-1 -- inharm2-2 -- inharm2-3 -- inharm2-4 -- inharm2-5)

RETURN TO INDEX

SUBTRACTIVE SYNTHESIS

CONVOLVER

In its setup and control, convolver is very similar to tvfilter. Its processing, however, is different. In tvfilter, filtering is produced by multiplying the magnitudes from the polar form of the two analyses, leaving the phases (or frequencies) of the source intact while modifying the amplitudes of those frequencies. Convolver goes a bit further by multiplying the two analyses in their Cartesian forms. This produces an intersection of the two spectra. Unlike tvfilter, which produces a shadowlike intersection, shadowing the analysis file characteristic onto the input sound file, convolver creates a true spectral intersection, allowing only that which is common to both sounds to be heard. The effect is a sound which is somewhat garbled as it outputs the more intermittently common spectral components of the two. The form of the multiplication in convolver does not allow some of the filter transposition controls associated with tvfilter. There is, however, a convolution panpot which offers control of the mix between the convolution and source sounds.

*** ECMC Note: *** At Eastman, use convolver.tp to create a script file to run this routine, and see the example files convolver1, convolver2, convolver3 (a mix of two sources: convolver3-1 and convolver3-2), convolver4 and convolver5.
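
The difference between the two multiplications described above can be sketched for a single FFT bin. This is only an illustration in Python, not PVC's C code; the function names tvfilter_bin and convolver_bin are hypothetical, and the real routines add controls (gain, panpot, smoothing) that are omitted here.

```python
import cmath

def tvfilter_bin(source: complex, analysis: complex) -> complex:
    """Polar-form filtering: multiply magnitudes, keep the source phase."""
    magnitude = abs(source) * abs(analysis)
    return cmath.rect(magnitude, cmath.phase(source))

def convolver_bin(source: complex, analysis: complex) -> complex:
    """Cartesian-form multiplication: magnitudes multiply and phases add."""
    return source * analysis

# One bin from each analysis (values chosen only for illustration).
src = cmath.rect(0.5, 0.3)   # magnitude 0.5, phase 0.3 radians
ana = cmath.rect(0.8, 1.1)   # magnitude 0.8, phase 1.1 radians

filtered = tvfilter_bin(src, ana)
convolved = convolver_bin(src, ana)
print(abs(filtered), cmath.phase(filtered))    # same phase as the source
print(abs(convolved), cmath.phase(convolved))  # phase is the sum of both
```

Both results have the same magnitude; only the phase differs, which is why the convolver output follows neither sound's timing alone.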


*** (2) PVC Programs for which NO ECMC .tp script exists: ***

Almost all ECMC users can skip the small font descriptions of the following programs, and jump ahead to the FEATURE EXTRACTION section. Because no ECMC .tp utility exists to create script files for the PVC programs that follow, usage of these programs is more difficult for ECMC users: you will have to create your own script files, based upon examples provided by Koonce and located in the directory
/usr/local/soundapps/PVC/Koonce
on the ECMC Linux and SGI systems. Nevertheless, I have included Koonce's documentation on these programs here because a few hardy ECMC users might want to take a crack at using one or more of these programs, and they illustrate the depth of the PVC package. Perhaps some day if I have tons of time I will create templates for these programs as well -- but this is not imminent.
(*** End of this ECMC note ***)

FILTER

Filter is a very useful routine for filtering a sound by a frequency response. Filtering is achieved by first creating the frequency response through either synthesis or analysis, followed by filtering with filter. Synthetic responses are created using either chordresponsemaker (which synthesizes a spectrum as a collection of harmonic tones), or filtresponsemaker (which synthesizes a frequency response using lines and breakpoints). Analyzed responses can be made with freqresponse (which analyzes a sound file segment and constructs a response representing the peak or average amplitudes). Once made, the magnitudes of the FFT response are multiplied against the time varying magnitudes of the input sound's FFT. Filter allows time-varying control of the response shape (warp), transposition/shift, compansion, smoothing, and source/filter mix, making this a very useful tool for quickly manipulating the spectral characteristics of a sound according to your synthetic or analytic goals. The synthetic forms can be run with the scripts S.filter_with_chord_synthesis or S.filter_with_breakpoint_synthesis; the analysis-based form with S.filter_with_analysis. The analytic form is a powerful tool for bringing the color of one sound into the realm of another.

CHORDRESPONSEMAKER

Chordresponsemaker is a routine that uses a collection of harmonic tones, variable in size, to create a synthetic frequency response. It is found in various filtering scripts.

FILTRESPONSEMAKER

Filtresponsemaker is a routine that uses breakpoints and straight lines to create a synthetic frequency response. It is found in various filtering scripts.

TVFILTER

Tvfilter is the time-varying (tv) form of filter. Tvfilter uses a pvanalysis file to change the magnitudes of the input sound file. As it is with filter, tvfilter multiplies the magnitudes of the analysis FFT against the magnitudes of the input sound's FFT, while preserving the frequency/phase characteristics of the input sound. Preserving the phase of the input sound file results in a cross-synthesis which sounds like the input sound file covered or suppressed by the shadow of the analysis file. Like filter, tvfilter offers a variety of controls for manipulating the filter characteristic. The use of a phase vocoder analysis to represent the filter characteristic also makes possible the temporal control of the filter file (i.e. backwards/forwards control) as found with twarp. Run this using the script S.tvfilter.

RETURN TO INDEX

RESONANCE/REVERB

RING

Ring uses the phase vocoder to create an all-pass resonator. It works by structuring the FFT resynthesis as a bank of feedback filters that feed back the sinusoid of each bin in a strength proportional to the amplitude of that bin (after adjustment by global feedback controls). This allows the sound to "ring" in a way something like reverb or comb filter resonance. The difference from comb filtering is that with ring, spectral resonance is created not through a collection of comb filters selected for their ability to resonate various pulse wave spectra, but rather through an array of feedback filters (sized by the FFT) that resonate a sine wave spectrum while dynamically tuning their feedback frequencies to the frequencies of the input sound. In short, it creates a kind of "self resonance". Ring is a nice way of increasing the resonant pitch characteristics of a sound, although it has its weaknesses. Ring works best with larger FFT sizes as it is attempting to synthesize or accentuate the more pitched/harmonic characteristics of the sound; this is something larger FFTs, with their increased frequency resolution, handle better. Use of the Kaiser window, with its low sidelobe amplitudes, helps as well. In addition, there is a threshold for preventing the noise features of a sound from being resonated, plus an EQ which can be positioned to filter either the source input to the feedback loop, or the feedback return. Run this using the script S.ring.

RINGFILTER

Ringfilter marries filter with ring by allowing a frequency response to be imposed on the resonance created with ring. Ringfilter begins to look more like multiple-delay, comb filter resonance since the static frequency response selects which frequencies will feed back. What is unique here is that the frequency response can come from an analysis, allowing the input sound to be resonated by the average spectral characteristic of another sound. A synthesized frequency response can be used as well. Like the EQ in ring, the filter in ringfilter can be positioned to either filter the source input to the feedback loop, or the feedback return where it will have the effect of introducing the filter characteristic more slowly through the resulting variable rates of decay. Run ringfilter with S.ringfilter_with_chord_synthesis to create a synthetic frequency response, and with S.ringfilter_with_analysis for an analyzed frequency response.

RINGTVFILTER

Ringtvfilter is to ringfilter what tvfilter is to filter; that is, it makes the filter in ringfilter time-varying. This is a sophisticated idea, that is, time-varying filtering of the resonance of a time-varying sound. The best characterization would be to say that Ringtvfilter imprints the shadow of one sound onto the reverb of another. Ringtvfilter requires some thought and finesse in order to separate and articulate the evolutions of the source, resonance, and filter. The best results are created using dynamic, high-profiled source sounds, rich with transient noise; and more constant, pitch/harmonic sounds for the time-varying filter. Like tvfilter, ringtvfilter requires an analysis file. Run this routine using S.ringtvfilter.

RETURN TO INDEX

NONLINEAR FREQUENCY DEVIATION

FILTDEVIATOR

The idea behind filtdeviator is to use a frequency response function to not only filter a sound (as with filter), but to create a topology of frequency deviation working in correlation with the filter. Consequently, filtdeviator is filter with added parameters for specifying how the filter frequency response function will be mapped into the deviation of frequency. The added parameters set the base and peak deviation for how the response will be mapped into both pitch transposition and frequency shift, and how the function will be warped within the range set by these limits. There is also a master (0-1) deviation control for globally controlling the deviation. All the controls of filtdeviator allow you to dynamically vary the presence and effect of amplitude filtering and frequency deviation, making filtdeviator an interesting routine for exploring the way filters can be used to impede/transform the resonant signature of a sound. Using small amounts of frequency deviation, with no amplitude filtering, and a sweeping transposition of the filter will produce an effect something akin to the commercial guitar phase shifter; larger amounts of deviation take it into another place entirely. Adding the correlated amplitude filtering conceals the deviation more (positioning it more at the edges of formants), producing a sound something like the floppy resonant behavior of slide whistles. The scripts to run filtdeviator -- S.filtdeviator_with_chord_synthesis and S.filtdeviator_with_analysis -- are designed with frequency response synthesis/analysis sections like those for filter and ringfilter. Run this routine using either S.filtdeviator_with_analysis or S.filtdeviator_with_chord_synthesis.

TVFILTDEVIATOR

Tvfiltdeviator is to filtdeviator what tvfilter is to filter; i.e. it uses a time-varying filter response in place of the constant one. This routine blows the lid off of what was unusual about filtdeviator. It's great for making wacky sounds out of ones with nice, fixed harmonies. The best use is to use it to deviate itself. Try taking something like a harpsichord or guitar (pitched stuff with decay) and do an analysis of the sound with pvanalysis. Then use the analysis to deviate the same sound. What happens is the strength of each of the sound's components becomes a control over the frequency deviation of that component, one that causes the sound to go "sproing" whenever it has any amplitude. Makes tonal music sound really broken. Run this routine with tvfiltdeviator.

RETURN TO INDEX


FEATURE EXTRACTION

ENVELOPE

Envelope is a routine for tracking the amplitude envelope of a sound. Output can be ASCII, floats or a NeXT soundfile. Selecting floats or ASCII will produce a file suitable for use in the control of a parameter.

*** ECMC Notes: *** At Eastman, obtain a script to run this routine with envelope.tp. See example file envelope1, which is used in example plainpv7

ROUTINES: SHORT DESCRIPTIONS

CENTROID

Centroid is a routine for tracking the centroid of a sound. The centroid is the average of all the frequencies weighted by their amplitudes. It essentially gives you a kind of center frequency value for your spectrum. The analysis can be restricted to a band of frequencies, allowing the centroid to track a particular frequency component (although pitchtracker can do this as well). Selecting floats or ASCII will produce a file suitable for use in the control of a parameter.

*** ECMC Note: *** ECMC users can use centroid.tp to obtain a template script file, edit this file and then use it to run centroid. Currently, there are no ECMC example files for centroid.
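
The centroid definition above (the average of all frequencies weighted by their amplitudes, optionally restricted to a band) can be written out for one analysis frame. This is a minimal sketch, not the centroid program itself; the function name and the low_hz/high_hz parameters are assumptions made for the example.

```python
def spectral_centroid(freqs, amps, low_hz=0.0, high_hz=float("inf")):
    """Amplitude-weighted mean frequency of the bins inside [low_hz, high_hz]."""
    weighted_sum = 0.0
    amp_sum = 0.0
    for f, a in zip(freqs, amps):
        if low_hz <= f <= high_hz:
            weighted_sum += f * a
            amp_sum += a
    # A silent frame has no meaningful centroid; return 0.0 by convention here.
    return weighted_sum / amp_sum if amp_sum > 0.0 else 0.0

freqs = [100.0, 200.0, 300.0]
amps = [1.0, 1.0, 2.0]
print(spectral_centroid(freqs, amps))                # (100 + 200 + 600) / 4 = 225.0
print(spectral_centroid(freqs, amps, 150.0, 350.0))  # band-limited: (200 + 600) / 3
```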

FLUXOID

Fluxoid is a routine for tracking the average frequency change of a sound. The average can be weighted (best) or not by the amplitudes. Selecting floats or ASCII will produce a file suitable for use in the control of a parameter.

*** ECMC Note: *** ECMC users can use fluxoid.tp to obtain a template script file, edit this file and then use it to run fluxoid. Currently, there are no ECMC example files for fluxoid.

PITCHTRACKER

Pitchtracker is a routine for tracking the fundamental pitch trajectory of a sound. It is an experimental routine that works, I believe, but forever has its quirks. Three detection methods are available for following the 1) fundamental of the harmonic collection, 2) the strongest formant, or 3) a band-limited centroid. Different output formats let you see, hear and eventually use the fruits of your pitch tracking.

*** ECMC Note: *** ECMC users can use pitchtracker.tp to obtain a template script file, edit this file and then use it to run pitchtracker. See example file pitchtracker1. The analysis output produced by this example is used in example plainpv8

RETURN TO INDEX

CONTROL FUNCTION PROCESSING : RESHAPE

Reshape is a routine for transforming function streams to meet the needs of different parameters. It takes a headerless float or ASCII function file as input and outputs a headerless stream of float or ASCII values. With the appropriate flags, it can be used to limit, resample, translate, warp, expand, shrink, invert, quantize, and lowpass filter the input values. The output can be translated into different amp or pitch units depending on your needs. Run reshape at the command line.

*** ECMC Note: *** reshape generally is used in a pipe after a gen routine to remap the values created by the gen routine to some new maximum and minimum range. For usage examples, see ECMC example files (e.g. twarp3, twarp4-1, twarp4-2 and inharm2-3).
(*** End of this ECMC note ***)

RETURN TO INDEX


TERMS AND COMMON FEATURES

Below are various terms, parameters, or ways of doing things which are common to many of the routines.

OVERLAP/ADD VS. OSCILLATOR BANK METHODS AND RESYNTHESIS THRESHOLDS:

The phase vocoder resynthesizes the signal using one of two methods, depending on the type of changes made to the FFT. If the changes are only to the magnitudes (amplitudes), then the faster overlap/add method is used. If however changes in frequency are made, then the FFT integrity is compromised, necessitating use of the oscillator bank method in which each bin is synthesized as a sine wave changing in frequency and amplitude. This method is slower, although a resynthesis threshold is available which can be used to increase the computation speed by turning off bins whose amplitude falls below the threshold. A threshold of -60dB is appropriate, although safety warrants using a lower threshold if the spectrum is thin and its decays exposed; use your ear.
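
To make the resynthesis threshold concrete: a threshold of -60 dB corresponds to an amplitude of 0.001, and bins below it would simply be skipped by the oscillator bank. The sketch below illustrates that gating in Python; the function names are hypothetical and the actual PVC code may decide per frame in a different way.

```python
def db_to_amp(db):
    """Convert decibels to linear amplitude (0 dB = 1.0)."""
    return 10.0 ** (db / 20.0)

def active_bins(bin_amps, threshold_db=-60.0):
    """Indices of bins loud enough to be worth synthesizing."""
    threshold = db_to_amp(threshold_db)
    return [k for k, a in enumerate(bin_amps) if a >= threshold]

amps = [0.5, 0.0005, 0.02, 0.00001]   # one frame of bin amplitudes
print(active_bins(amps))               # [0, 2] at -60 dB
print(active_bins(amps, -80.0))        # a lower threshold keeps bin 1 as well
```

Lowering the threshold keeps more of a thin spectrum's quiet decay, at the cost of more oscillators to compute.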

SOURCE

The source sound is the original input sound. Some routines allow for the mix of the processed sound with the original source sound.

MULTIPLE CHANNELS

All routines allow both monophonic and multi-channel input files to be processed. With multi-channelled files, you can either select one channel and produce a monophonic output file, or process all the channels. Channels are numbered beginning with 1. Processing of multi-channelled files is done one channel at a time beginning with channel 1, with zeros written to channels which have yet to be processed. Processing one channel at a time requires less memory and allows you to audition the output sooner than if you did all channels at once.

RETURN TO INDEX

FLOATING-POINT AMPLITUDE RESCALING

Selection of the floating-point, output-file format invokes an amplitude rescaling feature. Once processing is complete, a second pass through the sound file is made to rescale the values to the decibel level specified. A dB rescale level of 1 causes rescaling to the level of the original input file.
*** ECMC Note: *** Most ECMC users will never use the floating point option, and thus will never use this rescaling option, although I have included it near the bottom of the user parameter section of the ECMC .tp script files.

OUTPUT STATISTICS

Two flags are provided for controlling the output amplitude statistics; one turns the statistics on or off, and the other sets how often they will be reported. The statistics provide the peak output level in amplitude and decibels. With integer format output files, output values exceeding the normalized peak amplitude of 1.0 (0 dB) are clipped to a value of 1.0, and the statistics placed in clip mode; in clip mode reports are made only for frames where clipping occurs. The peak amplitude, its time, and the number of clipped samples are reported at the end of processing. With floating-point format output files, output values exceeding the normalized peak amplitude of 1.0 are not clipped since they will be rescaled in the second pass; output statistics proceed normally throughout. The levels before and after rescaling are reported at the end of processing.

RETURN TO INDEX

FREQUENCY RESPONSE TERMINAL OUTPUT

In many filtering or companding routines, a crude terminal print of the frequency response is available. A flag sets the high cutoff frequency for this output; a value of 0 (0 Hz) turns printing off.

ANALYSIS FILES

Analysis files are binary, 32-bit floating-point files written by pvanalysis, containing frames of FFT analysis data for one or more channels. Analysis file data is preceded by a header containing information about the analysis. Analysis files are much larger than the sound files they represent, and increase in proportion to the FFT size used. As such, files can become very large, so it is advisable to only make them when needed unless you have disk space to spare.

RETURN TO INDEX

DECIBELS

Amplitude is always handled in decibel units. The greatest magnitude of the 16-bit short integer is equated with an amplitude of 1.0 or 0 dB. 0 dB functions as unity gain, and the peak amplitude in issues of compression, expansion, and amplitude windowing. A change of +/- 6 dB represents a doubling or halving of the amplitude. Increments of 10 dB are loosely associated with one change in dynamic level. 16-bit shorts allow for a 96 dB dynamic range. Take care not to lose signal level as a consequence of processing since quantization noise will emerge when you attempt to regain your signal level by amplifying the integer sound file.
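
The figures quoted above follow from the standard amplitude-to-decibel conversion, dB = 20 * log10(amplitude). A short Python check (the function names are only for this example):

```python
import math

def amp_to_db(amp):
    """Linear amplitude (1.0 = full scale) to decibels."""
    return 20.0 * math.log10(amp)

def db_to_amp(db):
    """Decibels to linear amplitude."""
    return 10.0 ** (db / 20.0)

print(amp_to_db(1.0))            # 0.0, unity gain
print(amp_to_db(0.5))            # about -6.02, a halving of amplitude
print(amp_to_db(2.0 ** -16))     # about -96.3, the span of 16 bits
print(db_to_amp(-60.0))          # about 0.001
```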

LOW/HI SHELF EQUALIZATION

Equalization has been provided at various points in routines to allow for the needed adjustment of spectra. The EQ consists of low and hi shelf segments, whose width is adjusted through control of the shelf breakpoint frequency. The region between the shelf segments is represented by a linear decibel gradient between the decibel levels of the two shelves. Some routines implement the EQ before pitch changes, others after. EQ placed before pitch changes (pre-transpose/shift) will cause the EQ to be transposed with the pitch changes, whereas afterwards (post-transpose/shift) will keep them fixed as shifts and transpositions occur.

WARP INDEX

Many of the routines employ the principle of warping in which a distribution of values is transformed by an identity function. In these places an exponential function is employed to remap a 0-1 range of values into a new orientation that preserves the minima (0) and maxima (1) while bringing the distribution closer to either extreme as a result of the curvature of the exponential function selected. The curvature of the exponential function is selected through a warp index. Specifically, warp index w will reorient the input x through the function below (^ = exponentiation).

y = (1. - (e^(x * w))) / (1. - (e^w))

In this function, the warp index of 0 produces a linear function and an untransformed output. Positive warp index values of increasing magnitude produce curves of increasing concavity (increasing slope) that draw values towards the 0-valued minima, and reduce the function integral. Negative values do the opposite, drawing values towards the maxima of 1, increasing the integral.
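
The warp function above is easy to try out directly. The sketch below evaluates it in Python; note that at w = 0 the formula as written divides by zero, so the linear case is handled explicitly here (an implementation detail assumed for the example, not taken from the PVC source).

```python
import math

def warp(x, w):
    """Remap x in the 0-1 range with warp index w; endpoints 0 and 1 are preserved."""
    if abs(w) < 1e-9:
        return x   # the limit of the formula as w approaches 0 is linear
    return (1.0 - math.exp(x * w)) / (1.0 - math.exp(w))

for w in (-3.0, 0.0, 3.0):
    print(w, [round(warp(x, w), 3) for x in (0.0, 0.25, 0.5, 0.75, 1.0)])
```

With w = 3 the midpoint 0.5 maps to roughly 0.18 (drawn towards 0); with w = -3 it maps to roughly 0.82 (drawn towards 1).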

The practical use of this mechanism is found in various places. One such place is the reshaping of the frequency response distribution characteristics. In this, positive warp indices cause the peaks of the response to be accentuated while the weaker frequencies are expanded out (i.e. pushed towards 0). Negative values have the opposite effect as they compress the dynamic range of the response and raise the relative level of the weaker noise components. Another place where warp applies is in the remapping of FFT amplitudes through the spectrum warpshape. In this, the successive FFT frames have their amplitudes remapped by the identity function, similarly expanding or compressing the dynamic range depending upon the warp specified; 0 (linear warp function) leaves the amplitudes unchanged.

RETURN TO INDEX

PITCH TRANSPOSITION

With the pitch transposition control, a constant or function value is multiplied against all bin frequencies. This is classic transposition, here specified in semitones of transposition (12 semitones equals an octave). Conversion is made to produce the appropriate frequency multiplier.

FREQUENCY SHIFT

With the frequency shift control, a constant or function value is added to all the bin frequencies to produce a nonlinear pitch domain translation of the spectrum. Frequency shift is related to things like ring modulation and their similarly nonlinear shifts of pitch characteristics. Use this to create small distortions of the harmonic integrity of a sound.
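
The difference between the two controls shows up clearly on a harmonic series. In this sketch (function names assumed for illustration), the semitone value is converted to a multiplier with the equal-tempered relation 2^(semitones/12), while the shift simply adds a constant number of Hz.

```python
def transpose(freqs, semitones):
    """Multiply every frequency by the ratio for the given number of semitones."""
    ratio = 2.0 ** (semitones / 12.0)
    return [f * ratio for f in freqs]

def shift(freqs, shift_hz):
    """Add a constant offset in Hz to every frequency."""
    return [f + shift_hz for f in freqs]

harmonics = [100.0, 200.0, 300.0]
print(transpose(harmonics, 12.0))  # [200.0, 400.0, 600.0]: still a harmonic series
print(shift(harmonics, 50.0))      # [150.0, 250.0, 350.0]: ratios no longer 1:2:3
```

Transposition preserves the ratios between partials; shifting preserves their differences, which is what distorts the harmonic integrity of the sound.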

RETURN TO INDEX

ENVELOPE RESPONSE TIME

The rate at which amplitude changes are allowed to occur affects how smooth spectral evolutions will be. To control this, many routines contain attack and decay response time controls; once translated, these controls manipulate the coefficients of the following filter.

y(n) = (1. - A) * x(n) + A * y(n-1)

The filter is a lowpass designed to increasingly smooth the sudden changes in a signal as the value of the coefficient, A, is increased. Its control is through the response time parameter which is the time in seconds it takes a signal, shifting from one state to another, to decay to -60 dB of its former state. Response times are transformed to create the necessary coefficients for the selected frame rate. The response time is separated into attack and decay; this allows separate control of the smoothing of the signal depending upon whether it is increasing or decreasing in amplitude. Short attack/decay response times can be used in places where dynamic processing induces garble or even pops. You can use longer response times to generally smooth or blur the onset/offset of sound components, particularly if the response controls are being applied to a time-varying filter. When applied to amplitudes, longer decay response times do not sound good, for in their delay of the decay, they end up amplifying the residual noise of a sound.
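
One way to picture the translation from response time to coefficient is sketched below. It reads the feedback term of the filter as the previous output value, and it assumes the coefficient is chosen so that A raised to the number of frames in the response time equals 0.001 (-60 dB). That derivation, and the function names, are assumptions for illustration; the PVC source may compute its coefficients differently.

```python
def response_coefficient(response_time, frames_per_second):
    """Coefficient A for which a step decays by 60 dB in response_time seconds."""
    if response_time <= 0.0:
        return 0.0   # no smoothing: output follows input immediately
    frames = response_time * frames_per_second
    return 0.001 ** (1.0 / frames)

def smooth(values, attack_time, decay_time, frames_per_second=200.0):
    """One-pole lowpass with separate coefficients for rising and falling input."""
    a_attack = response_coefficient(attack_time, frames_per_second)
    a_decay = response_coefficient(decay_time, frames_per_second)
    y = 0.0
    out = []
    for x in values:
        a = a_attack if x > y else a_decay
        y = (1.0 - a) * x + a * y   # y on the right is the previous output
        out.append(y)
    return out

# An amplitude that jumps to 1.0 for one frame, then falls silent for one second.
track = smooth([1.0] + [0.0] * 200, attack_time=0.0, decay_time=1.0)
print(track[0], track[-1])   # starts at 1.0, ends near 0.001 (-60 dB)
```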

RING DECAY TIME

Decay time is an issue in the feedback of the ring routines. Like response time, it is the time it takes the signal to decay to -60dB of its former state, or better, the time it takes the reverb to decay to -60dB.

RETURN TO INDEX

FFT SIZE:

The FFT size must be a power of 2. Larger FFT sizes resolve frequencies better but transient behavior more poorly. Choose your FFT size according to the sound you are working with. A size of 1024 or 2048 works well in most cases.
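
The trade-off can be put in numbers. Assuming the usual relations for an N-point FFT (bin spacing = sample rate / N, and a span of N samples per frame), the following sketch prints both for a few sizes. The 44100 Hz sample rate and the function names are only example choices.

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ... (the sizes an FFT size may take)."""
    return n > 0 and (n & (n - 1)) == 0

def bin_spacing_hz(sample_rate, fft_size):
    """Frequency distance between adjacent FFT bins."""
    return sample_rate / fft_size

def frame_span_ms(sample_rate, fft_size):
    """Duration covered by fft_size samples, in milliseconds."""
    return 1000.0 * fft_size / sample_rate

for n in (256, 1024, 4096):
    assert is_power_of_two(n)
    print(n, round(bin_spacing_hz(44100.0, n), 2), "Hz per bin,",
          round(frame_span_ms(44100.0, n), 1), "ms per frame")
```

Each doubling of the FFT size halves the bin spacing and doubles the time span that a transient is smeared across.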

WINDOW SIZE

The window size is a less opaque parameter; like the FFT, it must be a power of 2. Windows which are twice the size of the FFT work well. Larger window sizes may resolve frequencies better. Specifying 0 for the window size will automatically set the window to twice the FFT size, a feature I have always used.

WINDOW TYPE

The FFT and inverse FFT are computed using a window. Like the FFT size, the shape of the window used can affect the quality of the analysis and resynthesis. (See F.R.Moore, Stieglitz, or Roads for further explanation.) A variety of windows are available including: Hamming, Rectangular, Blackman, Triangular, and Kaiser (in 8 different forms as related to 8 different alpha values). Blackman (-w2) or Kaiser (-w8) are recommended for most applications. In some unusual cases where transient behavior is being lost, consider using other windows such as the Rectangular, although take care to assure that it is not producing pops or a buzzy sound.

RETURN TO INDEX

FRAMES PER SECOND

This controls how often the phase vocoder will perform an analysis on the signal. It is a translation of the classic decimation control which specifies how many samples to skip between analysis frames. More frames increase the resolution of time but decrease speed. 200 frames per second is a good reference point. If you expand time you should increase this proportionately to maintain about 200 or more frames per second.

TIME EXPANSION/CONTRACTION

Once the spectral modifications are made to the FFT analysis, an inverse FFT is invoked to produce the samples of a time-domain signal. The classic phase vocoder paradigm controls the number of samples through the interpolation value and its relation to the decimation. The arcane relationship of decimation and interpolation is here translated into the parameter of time expansion/contraction, allowing for the direct scaling of time. Use values greater than 1 to expand time, less than 1 to contract it.
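
The translation between these user parameters and the classic decimation/interpolation pair can be sketched as follows, assuming the textbook relations: decimation = sample rate / frames per second, and interpolation = decimation * time scale. The rounding and the function names are assumptions for the example, not a description of PVC's internals.

```python
def decimation(sample_rate, frames_per_second):
    """Samples skipped between successive analysis frames."""
    return max(1, round(sample_rate / frames_per_second))

def interpolation(decimation_samples, time_scale):
    """Samples advanced between successive resynthesis frames."""
    return max(1, round(decimation_samples * time_scale))

d = decimation(48000, 200)
print(d)                        # 240 samples between analysis frames
print(interpolation(d, 2.0))    # 480: output is twice as long
print(interpolation(d, 0.5))    # 120: output is half as long
```

Seen this way, expanding time by 2 at 200 frames per second leaves only 100 synthesis frames per output second, which is why the frame rate should be raised in proportion.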

RETURN TO INDEX

BEGIN/END TIMES

Processing may be performed on an entire file or a segment of it by specifying begin and end times. End times less than or equal to 0 default to the end of the input file.

GAIN:

The output and other components can be gained. 0 dB represents unity gain, no change. See decibels.

FILTERING: SOURCE SIGNAL LEVEL

The mix of source and filtered sounds in the filter routines can be controlled by the source decibels floor. This value, taken from the -96 to 0 dB range, specifies the level of the source signal. The filtered signal level is equal to (1 - source amplitude floor). Consequently, the source level functions as a floor over which lies the filtered signal. A source floor of 0 dB would neutralize filtering since there would be no filter range above the floor, a floor of -96 dB would produce the full effect of the filter.
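
Read literally, the description above suggests a per-bin gain of floor + (1 - floor) * response, where floor is the source level converted from decibels to amplitude and response is the filter magnitude in the 0-1 range. The sketch below implements that reading; it is an interpretation of the text, with hypothetical names, and not a transcription of the filter code.

```python
def db_to_amp(db):
    """Decibels to linear amplitude (0 dB = 1.0)."""
    return 10.0 ** (db / 20.0)

def filter_gain(response, source_floor_db):
    """Gain applied to a bin: the source floor plus the filter range above it."""
    floor = db_to_amp(source_floor_db)
    return floor + (1.0 - floor) * response

for floor_db in (0.0, -6.0, -96.0):
    print(floor_db, [round(filter_gain(r, floor_db), 4) for r in (0.0, 0.25, 1.0)])
```

At 0 dB the gain is 1.0 for every bin (filtering neutralized); at -96 dB the gain is essentially the response itself.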

RETURN TO INDEX

TRANSPOSITION/SHIFT APPLICATION FLAG

Filter routines which allow for transposition and frequency shifting of both filter and source have a flag which specifies whether transposition/shift should be applied before or after filtering. If it is applied before, the pitch transposition trajectory will evolve independent of the filter's trajectory of transposition. If it is applied after, then the pitch transposition trajectory will be added to the filter transposition trajectory, causing the filter to move in parallel with the pitch transposition movements plus any movements the filter transposition parameter adds.

FILTER TYPES: PASS OR REJECT

Filters can be toggled to use frequency responses in pass or rejection mode. In pass mode, the response's stronger magnitudes are used to pass source through the filter; in rejection mode, to impede or reject components. In rejection mode, the response is created by inversion in the decibel range, not amplitude. In time-varying filtering (tvfilter), rejection can be in mode 1 in which the response is inverted against a constant 0 dB peak, or in mode 2 in which the response is inverted against the current analysis frame's peak amp. Spectral warping is always applied after the response has been transformed by rejection.

RETURN TO INDEX

RESPONSE FUNCTION SMOOTHING

Many routines which use frequency response files to filter or warp amplitudes have a control which allows the response to be smoothed. The smoothing is produced by replacing the magnitude of a frequency bin with an average taken from a band centered around that bin. The degree of smoothing is controlled through manipulation of a bandwidth value, specified in octave units. Larger bandwidths produce greater degrees of smoothness, 0 turns smoothing off.
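
A minimal sketch of this kind of smoothing, assuming the band for bin k spans half the bandwidth (in octaves) on either side of the bin's frequency, with an unweighted average over the bins inside it. The exact band edges and weighting used by PVC are not stated in the text, so treat those details as assumptions.

```python
import math

def smooth_response(mags, bandwidth_octaves):
    """Replace each bin magnitude with the mean over a band centered on that bin."""
    if bandwidth_octaves <= 0.0:
        return list(mags)   # 0 turns smoothing off
    half_ratio = 2.0 ** (bandwidth_octaves / 2.0)
    out = []
    for k in range(len(mags)):
        # Band edges in bin units; bin frequency is proportional to k.
        lo = int(math.ceil(k / half_ratio))
        hi = min(int(math.floor(k * half_ratio)), len(mags) - 1)
        band = mags[lo:hi + 1]
        out.append(sum(band) / len(band))
    return out

mags = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
print(smooth_response(mags, 2.0))   # the single peak at bin 4 is spread out
```

Because the bandwidth is in octaves, the band covers more bins at higher frequencies, so high-frequency detail is smoothed over a wider range in Hz than low-frequency detail.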

ANALYSIS DATA: ACCESS MODES

Routines which use analysis data made with pvanalysis (twarp, convolver, tvfilter, ringtvfilter, and tvfiltdeviator) access data in the same way, using the time-point, rate, and data window boundary parameters, set to function in either rate or explicit mode. In rate mode, the rate determines the speed of movement through a data file; the time-point sets the starting position. The rate may be positive (forward in time) or negative (backwards in time), and vary according to a function. Explicit mode uses the time point parameter to specify exactly where the analysis data should come from (units here are in the time of the analyzed sound). (Explicit mode does not use the rate control, and makes sense only if the time-point is controlled with a function.) Both rate and explicit modes abide by the upper and lower data window boundaries which delimit the data range. When the time-pointer moves beyond the specified upper and lower time boundaries, it re-enters the window from the other end, making the window into a circular/modular structure. The boundaries can be controlled with functions as well, giving this mode an expressive dimension far surpassing the time expansion/contraction parameter. There is also an auto-stop feature that, when turned on, causes processing to stop when it reaches the end of the analysis.
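
The circular behavior of the data window in rate mode can be sketched with modular arithmetic. The function names, the constant rate, and the fixed boundaries below are simplifications for illustration; in PVC the rate and the boundaries may themselves be functions of time.

```python
def wrap(t, lower, upper):
    """Fold a time pointer back into the window [lower, upper), re-entering from the other end."""
    span = upper - lower   # assumed to be greater than 0
    return lower + (t - lower) % span

def rate_mode_times(start, rate, lower, upper, frames, frames_per_second):
    """Analysis times read on successive output frames in rate mode."""
    t = start
    times = []
    for _ in range(frames):
        times.append(t)
        t = wrap(t + rate / frames_per_second, lower, upper)
    return times

print(rate_mode_times(0.0, 1.0, 0.0, 1.0, 5, 4.0))    # forward, wraps at 1.0
print(rate_mode_times(0.0, -1.0, 0.0, 1.0, 3, 4.0))   # backward, wraps below 0.0
```

Explicit mode would skip the accumulation entirely and pass each requested time point through wrap on its own.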

RETURN TO INDEX

CONVOLVER PANPOT

The convolver routine has a unique panpot mechanism for controlling the mix of input sounds (A and B) with their convolution. The panpot is a crossfade mechanism that uses a -1 to 1 control range to accentuate either sound A, B or their convolution. A value of -1 produces an output consisting entirely of sound A, a value of 1, sound B. The 0 between these extremes produces the convolution of A and B. Values between these points produce a crossfade mix of either A or B and the convolution. For example, a trajectory from -1 to 1 would crossfade from sound A into the convolution, and on to sound B. Separate gain controls for A, B and the convolution make it possible to tune the continuity of this trajectory. In addition, the presence or spread of the convolution into the crossfade range can be tuned with the domain warp controls. The domain warp reshapes the movement through the crossfade range, allowing you to create a more gradual approach from A or B into the convolution center. This is achieved through a simple nonlinearizing of the crossfade domain in warp index style. Increasingly positive domain warp values (specified independently for each side) transform the linear trajectory towards the convolution into a decelerating one, causing the subtle mix area around 0 to be expanded. Therefore, if you want to hear more convolution in your crossfade, increase the panpot domain warps.
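
The panpot behavior described above can be modeled as three mixing weights. The sketch assumes a linear crossfade and applies the warp index function from the WARP INDEX section to the distance from the center; both choices, and all names, are assumptions made to illustrate the idea rather than the convolver's actual code. The separate gain controls are left out.

```python
import math

def warp(x, w):
    """Warp index function; w = 0 is linear."""
    if abs(w) < 1e-9:
        return x
    return (1.0 - math.exp(x * w)) / (1.0 - math.exp(w))

def panpot_weights(pan, warp_a=0.0, warp_b=0.0):
    """Return (weight_a, weight_convolution, weight_b) for pan in -1..1."""
    pan = max(-1.0, min(1.0, pan))
    if pan < 0.0:
        d = warp(-pan, warp_a)        # warped distance from center towards A
        return (d, 1.0 - d, 0.0)
    d = warp(pan, warp_b)             # warped distance from center towards B
    return (0.0, 1.0 - d, d)

print(panpot_weights(-1.0))               # all sound A
print(panpot_weights(0.0))                # all convolution
print(panpot_weights(0.5))                # linear: half convolution, half B
print(panpot_weights(0.5, warp_b=3.0))    # positive warp: more convolution
```

Since a positive warp index draws values towards 0, a larger domain warp keeps the warped distance small for longer, which widens the region where the convolution dominates.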

FREQUENCY RESPONSE ACCUMULATION METHOD

Several of the response-producing routines have the option of accumulating the response by either peak or average means. Whereas peak responses represent the record of a sound's thresholds (or synthesis specification's highest values), average responses represent the most common characteristics. If the sound you are analyzing has intermittent moments of sound whose peak characteristics you wish fully represented in the response, use the peak mode; otherwise use the average.

RETURN TO INDEX

RING ROUTINES: FILTER PLACEMENT

Ringfilter and ringtvfilter (for which there are no ECMC .tp scripts) use frequency response functions to filter the reverb. Two filtering modes are available, in which either the source input to the feedback is filtered, or the feedback itself. When the response is used to filter the source input, it filters the signal before it enters the feedback mechanism, imposing its characteristic on the feedback from the start. However, when the response is positioned to filter the feedback component, its spectral characteristic appears in the reverb gradually as the signal decays. In this mode, the time it takes the signal to decay into the response characteristic is controlled by an additional decay time associated with the filter.
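
The two placements can be compared with a deliberately simplified model. The sketch below tracks the per-bin amplitude of successive echoes of a flat-spectrum impulse, assuming a single feedback gain and a response given as one multiplier per bin. It ignores the filter's additional decay time, and the names are hypothetical, so treat it as a picture of the principle rather than of the ring routines themselves.

```python
def echo_spectra(response, feedback_gain, n_echoes, filter_feedback):
    """Per-bin amplitude of each successive echo of a flat-spectrum impulse."""
    spectra = []
    for n in range(n_echoes):
        if filter_feedback:
            # The response is applied once per trip around the feedback loop.
            spectra.append([(feedback_gain * h) ** n for h in response])
        else:
            # The response is applied once, before the signal enters the loop.
            spectra.append([h * feedback_gain ** n for h in response])
    return spectra


response = [1.0, 0.5]   # second bin attenuated by half
print(echo_spectra(response, 0.5, 3, filter_feedback=False))
print(echo_spectra(response, 0.5, 3, filter_feedback=True))
```

In the source-filtered case every echo already carries the response's shape; in the feedback-filtered case the first echo is unfiltered and the shape deepens with each repetition.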

COMPRESSION AND EXPANSION

Spectral compression and expansion play a role in many routines. Their implementation here follows the traditional model, which uses thresholds and magnitudes of compression/expansion to reduce or enlarge the dynamic range of a signal. With spectral compression, amplitudes that exceed the specified compression threshold are reduced by an amount determined by the decibels of compression (a multiplier of the bin's amplitude lying above the threshold). Expansion works in a similar fashion, except that it changes the amplitudes below, rather than above, the expansion threshold; this results in an expansion of the dynamic range as the bins falling below the threshold are made to cover a wider range.

The term companding or compander is a merging of the two names, useful in situations where they are both available. While compander is the most obvious example of a routine using companding, traditional compression can be found in several other routines that involve filtering. It is not uncommon, in those routines, to reduce the dynamic range of an analyzed frequency response, particularly if it is time-varying, since the goal in filtering is more about color than dynamic range.

In all routines that use some form of companding, the dynamic range of the unprocessed signal/response is assumed to lie between 0 and -96 dB; thresholds are chosen from within this range. The degree of compression or expansion, expressed in decibels, represents how much the signal lying beyond the threshold will be reduced. A value of -6 dB would halve the dynamic range above the threshold in compression, or double the range below the threshold in expansion.
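
One reading of this arithmetic is sketched below. It assumes that the decibel amount is converted to a linear multiplier (-6 dB is roughly 0.5), that compression multiplies the distance above the threshold by that factor, and that expansion divides the distance below the threshold by it. The function names are hypothetical and the output is not clamped to the 0 to -96 dB range.

```python
def db_to_ratio(db):
    """Convert a decibel amount to a linear multiplier (-6 dB -> about 0.5)."""
    return 10.0 ** (db / 20.0)


def compress_db(level_db, threshold_db, amount_db):
    """Shrink the part of the level lying above the threshold."""
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) * db_to_ratio(amount_db)


def expand_db(level_db, threshold_db, amount_db):
    """Stretch the part of the level lying below the threshold."""
    if level_db >= threshold_db:
        return level_db
    return threshold_db - (threshold_db - level_db) / db_to_ratio(amount_db)


# A 0 dB bin, 24 dB above a -24 dB threshold, ends up about 12 dB above it.
print(round(compress_db(0.0, -24.0, -6.0), 2))
# A -48 dB bin, 24 dB below the threshold, ends up about 48 dB below it.
print(round(expand_db(-48.0, -24.0, -6.0), 2))
```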

Compander applies compression for each frequency bin separately rather than as a macro gain change. It does this by using a frequency response file, created with freqresponse, to establish a unique, 0 dB point of reference for every bin; using its unique point of reference, every bin is compressed or expanded.
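
The idea of a separate 0 dB reference for every bin can be illustrated as follows. This is a minimal sketch of compression only, assuming the reference is a list of per-bin amplitudes and using the same multiplier interpretation of the decibel amount as the previous discussion; the function name and data layout are hypothetical and unrelated to the actual freqresponse file format.

```python
import math


def compand_bins(amps, reference, threshold_db, amount_db):
    """Compress each bin relative to its own reference amplitude."""
    ratio = 10.0 ** (amount_db / 20.0)
    out = []
    for amp, ref in zip(amps, reference):
        if amp <= 0.0 or ref <= 0.0:
            out.append(0.0)
            continue
        # Level of this bin measured against its own 0 dB point.
        rel_db = 20.0 * math.log10(amp / ref)
        if rel_db > threshold_db:
            rel_db = threshold_db + (rel_db - threshold_db) * ratio
        out.append(ref * 10.0 ** (rel_db / 20.0))
    return out


# A loud bin and a quiet bin, each sitting exactly at its own reference level.
print(compand_bins([1.0, 0.1], [1.0, 0.1], -12.0, -6.0))
```

Both bins are reduced by the same proportion even though their absolute amplitudes differ by 20 dB, which a single macro gain change keyed to absolute level would not do.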

RETURN TO INDEX


UTILITIES

FILE CONVERSION: aiffs, aiffd, nexts, nextd, nextfloats

The sound file conversion scripts aiffs, aiffd, nexts, nextd, and nextfloats are shell scripts for converting sound files back and forth between aiff and next formats, or from next to floats. They are all effectively SGI scripts since they use the SGI sound file format conversion utility, sfconvert. Aiffs and aiffd take next integer files and write new aiff files; nexts and nextd do the opposite. In addition, aiffs and aiffd can be used to write new aiff integer files converted from next float files. Nextfloats writes a new float file from a next integer file. The s or d following the aiff or next in the name stands for the action taken on the original file once the new file is made: the s saves the original file (i.e. does not delete it), the d causes it to be deleted. Multiple files may be converted with the same run of the command. Running the command without any input files will produce a description of the routine.
*** ECMC Note: *** ECMC users probably will never need to use these file format conversion utilities, since script files provided by the ECMC .tp utilities take care of all necessary format conversions. Most of the file conversion utilities mentioned here do not work on the ECMC Linux systems anyway.

FUNCTION VIEWING: showme, showspect

Two graphing scripts are available for viewing functions and spectral data. You must have gnuplot installed on your computer to use them (type gnuplot <CR> to see if you do). Showme is a simple script for viewing function files. Run it without an input file for a description. Showme takes headerless floating-point or ASCII (give -a flag) function files and plots them. Showspect plots the file of FFT amplitude or frequency data produced by the plainpv script, S.plainpv_with_printout_and_graph_files. Showspect is useful for seeing a graphic representation of a very particular part of an analysis; it is not a substitute for a standard spectrogram application.

RETURN TO INDEX


GEN FUNCTION CONTROL OF PARAMETERS

Any parameter whose flag on the routine's information page has the word (func) after it
[or, within an ECMC template script file, includes the comment # int, float or FUNC]
can be controlled by a function file. To make these files, complete CMUSIC gen command lines are inserted into a script, like this:

gen4 -L1000 0 -3 0 1 3 > $SFDIR/ptrans ;

The file $SFDIR/ptrans created by this sample command contains floating point values representing the trajectory that the pitch transposition should take. Creating gen function files like this within a PVC script file is how time-varying parameter values are specified.

Such function file definitions may be placed near the top of a script file that runs a PVC program, before the arguments to plainpv or whatever other PVC program is being used, and we may group all of these function file definitions together. Alternatively they can be created within the body of a PVC file, perhaps just before, or even just after, the parameter which they control.

Lines in shells can be continued onto new lines with the backslash, which comes in handy with gen functions. The above, for example, could be entered as:

gen4 -L1000 \
0 -3 0 \
1 3 \
> $SFDIR/ptrans ;
which would simplify our parse of it.
***********************************************************

*** ECMC Notes on creating and using gen functions: ***

Near the top of the ECMC .tp template files I have included commented template lines for creating function tables with CMUSIC gen routines:

   ##### Cmusic function file generator templates #####
#   gen0  normalizes function files previously created with other gen routines
# gen0 -Llength  max < inputfuncfile > outputfuncfile
#   gen1 creates linear {straight line} segments, like Csound gen 7
# gen1 -Llength t1 v1 ... tN vN
#   gen2 generates harmonic waveforms from sine {a} & cosine {b} amps
# gen2 -Llength [-o (default) or -c] a1 ... aN b0 ... bM N
#  gen3 generates amp values & linear connections at equally spaced time points
# gen3 -Llength v1 v2 ... vN
#   gen4 generates exponential segments; "a" values determine shape &
#  depth of curve: 0 = linear, neg. = exponential, pos. = inverse expo.
# gen4 -Llength t1 v1 a1 ... tN vN
#  gen5 is like Csound gen 9 : harmonic1/amp/phase harmonic2/amp/phase
# gen5 -Llength h1 a1 p1 ... hN aN pN
#     gen6 generates a table of random numbers between +1 and -1
# gen6 -Llength
#    cspline: smooth curve {cubic spline} interpolator
# cspline len_flag [flags] x0 y0 x1 y1 ... xN yN
#   genraw reads in a previously created function file
# genraw -LN filename    (where N is the length of the output function.)
# For a usage summary of "reshape" type  "reshape"  with no arguments.
   ##### End of gen routine function generator templates #####

Many of the ECMC PVC example files, including plainpv5, twarp1 and harmonizer1, include function table definitions. These function generating routines are similar in several respects to the gen routines in Csound. However, whereas Csound stores function tables in RAM, PVC requires that these tables be written to disk files.

To create a function file, copy the appropriate gen routine template line to a new line, removing the leading # comment symbol, edit the line, and specify a file name for the output. Although these files are fairly small -- typically 1 kb -- I recommend writing them to your $SFDIR ("current working soundfile") directory, rather than to your current Unix directory or to /tmp. The gen routines you are most likely to find useful are those that create time/value envelope shapes: gen1, gen3 and gen4.

A quick tutorial on gen1 through gen5 is provided below. Those who want additional information on CMUSIC gen routines can consult Appendix D in F. Richard Moore's Elements of Computer Music text (on reserve at Sibley for CMP 421-2). Excerpts from this appendix are included as an appendix to the hardcopy ECMC PVC EXAMPLES binder available in the studios.

gen1 creates linear {straight line} segments, like Csound gen 7

     syntax:  gen1 -Llength t1 v1 ... tN vN
Examples: Either of the following two lines would generate an identical result:

     gen1 -L100 0 0   .5 2.5   1. 0  > $SFDIR/updown
     gen1 -L100 0 0   50 2.5   100 0 > $SFDIR/updown
Result : the values ascend linearly from 0 to 2.5 half way through the table, then descend from 2.5 to 0 during the second half of the table
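
To see why the two forms are equivalent, here is a small Python stand-in for a breakpoint table of this kind. It is a sketch under assumptions, not the CMUSIC source: it assumes the first and last breakpoint times are mapped onto the first and last table entries, so that only the proportions between the times matter, and the function name is hypothetical.

```python
def linear_table(length, breakpoints):
    """Build a table from a flat list t1, v1, ..., tN, vN (times increasing)."""
    if length < 2:
        raise ValueError("table needs at least two entries")
    times = breakpoints[0::2]
    values = breakpoints[1::2]
    span = times[-1] - times[0]
    table = []
    seg = 0
    for i in range(length):
        # Map table index onto the breakpoint time axis.
        t = times[0] + span * i / (length - 1)
        while seg < len(times) - 2 and t > times[seg + 1]:
            seg += 1
        frac = (t - times[seg]) / (times[seg + 1] - times[seg])
        table.append(values[seg] + (values[seg + 1] - values[seg]) * frac)
    return table


a = linear_table(101, [0, 0, 0.5, 2.5, 1.0, 0])
b = linear_table(101, [0, 0, 50, 2.5, 100, 0])
print(a[0], a[50], a[100])
print(max(abs(x - y) for x, y in zip(a, b)))
```

Because the times are rescaled to the table length, breakpoints at 0, .5, 1 and at 0, 50, 100 describe the same shape.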

Note: You cannot look at the values within these function tables, since they are in binary format. If you would like to see the values in a table, to make sure you are getting what you want, before you run a job, remove the file redirect at the ends of lines like those above and include an exit command:


     gen1 -L100 0 0   .5 2.5   1. 0 
     exit
This will cause the table values to be displayed in your shell window.
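
If a function file has already been written, its values can also be inspected outside the shell. The sketch below assumes the file is a headerless run of 32-bit floats in the machine's native byte order; the description of showme's input as headerless floating-point files suggests this, but the sample size and byte order are assumptions to check on your own system. The helper names and the temporary file are hypothetical.

```python
import array
import os
import tempfile


def write_function_file(path, values):
    """Write values as raw 32-bit floats with no header (assumed layout)."""
    with open(path, "wb") as handle:
        array.array("f", values).tofile(handle)


def read_function_file(path):
    """Read a headerless file of 32-bit floats back into a Python list."""
    values = array.array("f")
    with open(path, "rb") as handle:
        values.frombytes(handle.read())
    return list(values)


# Round trip on a throwaway file standing in for $SFDIR/updown.
demo_path = os.path.join(tempfile.mkdtemp(), "updown")
write_function_file(demo_path, [0.0, 1.25, 2.5, 1.25, 0.0])
print(read_function_file(demo_path))
```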
- - - - - - - -

gen2 generates harmonic waveforms from alternating sine {a} & cosine {b} amplitude values. Generally, only sine values are used.
     syntax: gen2 -Llength [-o (default) or -c] a1 ... aN b0 ... bM N
Example:
     gen2 -L100  1. 0   1/3 0   1/5 0   1/7 0 # > $SFDIR/square
Result: harmonics 1,3,5,7 in sine phase are created with proportions of a square wave
- - - - - - - -

gen3 generates amplitude values and linear connections at equally spaced time points

     syntax:   gen3 -Llength v1  v2 ... vN
Example:
     gen3 -L100 1  .35  1.2  0
Result: Values decrease linearly from 1. to .35 1/3 of the way through the table, then increase from .35 to 1.2 at 2/3 way through the table, then decrease from 1.2 to 0 at the end of the table
- - - - - - - -

gen4 can be a powerful but complicated envelope generating routine to use because one must specify 3 values for each breakpoint except the last, where only 2 arguments are necessary. These arguments for each breakpoint are:

time (t), value (v), and (a), which determines the slope of the curve between this breakpoint and the next.
  • an a value of 0 specifies linear connection
  • negative a values (e.g. -1) specify exponential curves, where most of the change comes near the beginning of the following slope
  • positive a values (e.g. 1) specify inverse exponential curves, where most of the change comes near the end of the following slope
Syntax: gen4 -Llength t1 v1 a1 t2 v2 a2 ... tN vN
Example: gen4 -L50 0  -2. 1    .33 4  1.   .67 2.5 -1   1. 0 > $SFDIR/glissando
                   t1 v1  a1   t2  v2 a2   t3  v3  a3   t4 v4
Result: Values in the table move with an inverse exponential slope from -2. to 4. over the first 1/3 of the table, then with an inverse exponential slope from 4. to 2.5 over the second third of the table, then exponentially from 2.5 to 0 over the final third of the table.
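
The effect of the a values can be explored with the Python sketch below. It assumes the transition formula usually quoted for gen4, in which the fraction of the move completed at position x (0 to 1) is (1 - e^(a*x)) / (1 - e^a); verify this against Appendix D of Moore's text before relying on exact values. The function names and the mapping of table indices to breakpoint times are assumptions for illustration.

```python
import math


def curve_fraction(x, a):
    """Share of the move to the next value completed at position x (0 to 1)."""
    if a == 0:
        return x
    return (1.0 - math.exp(a * x)) / (1.0 - math.exp(a))


def exponential_table(length, args):
    """args is the flat list t1, v1, a1, t2, v2, a2, ..., tN, vN."""
    times, values, alphas = args[0::3], args[1::3], args[2::3]
    span = times[-1] - times[0]
    table = []
    seg = 0
    for i in range(length):
        t = times[0] + span * i / (length - 1)
        while seg < len(times) - 2 and t > times[seg + 1]:
            seg += 1
        x = (t - times[seg]) / (times[seg + 1] - times[seg])
        step = values[seg + 1] - values[seg]
        table.append(values[seg] + step * curve_fraction(x, alphas[seg]))
    return table


glissando = exponential_table(50, [0, -2.0, 1, 0.33, 4, 1.0, 0.67, 2.5, -1, 1.0, 0])
print(glissando[0], glissando[-1])
# Halfway through a segment: less than half done for a = 1, more for a = -1.
print(curve_fraction(0.5, 1), curve_fraction(0.5, -1))
```

Under this formula a positive a delays most of the change until the end of the segment, and a negative a front-loads it, matching the descriptions of the a values given above.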
- - - - - - - -

gen5 is similar to Csound's gen9, generating harmonic (or, less often, inharmonic) waveforms. The user specifies one or more partials, and for each partial, three arguments: the partial frequency (as a multiplier of a fundamental of 1), its relative amplitude (on a scale of 0 to 1), and its phase (between 0 and 360 degrees). The resulting table numbers typically have values between +1. and -1.

Syntax:  gen5 -Llength h1 a1 p1 ... hN aN pN
Four examples:
(1)  gen5 -L1000 1   1   0  > $SFDIR/sine  # sine wave
                   h1  a1  p1
Result: One cycle of a sine wave with values between +1. and -1.
(2)  gen5 -L1000 3 1 90  > $SFDIR/harm3
Result: Three cycles (the third harmonic) of a cosine wave (a sine wave with a 90 degree phase shift)

(3) gen5 -L1000 2  1  0    4  .5  0   7  .2   0 > $SFDIR/harm247
                h1  a1 p1  h2  a2 p2  h3  a3  p3

Result: This produces a waveshape that includes harmonic 2 at 100 % amplitude, harmonic 4 at 50 % amplitude and harmonic 7 at 20 % amplitude

(4)  gen5 -L1000 1 1 0  # | reshape -b0 -B1. #   > $SFDIR/tempfunc
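Result: A sine wave with values rescaled between 0 and 1. (As printed, the # symbols comment out the reshape pipe and the file redirect; remove them to apply the rescaling and write the file.)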
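
The four examples can be mimicked in Python to check what shape to expect. The sketch below assumes the table holds exactly one period of the fundamental and that phases are given in degrees, as described above. The rescale helper is only a hypothetical stand-in for the effect attributed to reshape -b0 -B1., not a description of how reshape works.

```python
import math


def partials_table(length, partials):
    """partials is a list of (harmonic, amplitude, phase_in_degrees) tuples."""
    table = []
    for i in range(length):
        angle = 2.0 * math.pi * i / length   # one period of the fundamental
        table.append(sum(a * math.sin(h * angle + math.radians(p))
                         for h, a, p in partials))
    return table


def rescale(table, low, high):
    """Linearly map the table so its minimum is low and its maximum is high."""
    lo, hi = min(table), max(table)
    if hi == lo:
        return [low] * len(table)
    return [low + (x - lo) * (high - low) / (hi - lo) for x in table]


sine = partials_table(1000, [(1, 1, 0)])                      # example (1)
harm3 = partials_table(1000, [(3, 1, 90)])                    # example (2)
harm247 = partials_table(1000, [(2, 1, 0), (4, 0.5, 0), (7, 0.2, 0)])  # example (3)
unipolar = rescale(sine, 0.0, 1.0)                            # example (4)
print(sine[250], harm3[0], min(unipolar), max(unipolar))
```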
- - - - - - - -

Below are a few additional sample function generating lines from some of the ECMC PVC example files:

From example file plainpv5:

gen4 -L1000 0 -90 0 \
.1 12 0 \
.8 3 0 1 -90 > $SFDIR/ampfunc

Here, an amplitude function table with 1000 values is created, moving linearly from -90 to +12 over the first 10 % of the table, then to +3 at 80 % of the way through the table, then to -90 over the last 20 % of the table. This table is written to a file called "ampfunc" in the user's current working soundfile directory.

Two functions from example file plainpv6-1:

gen3 -L1000 5 5 -5 -5 5 > $SFDIR/spectrumfunc
The values remain at 5 during the first 1/4 of the table, then move linearly from +5. to -5. during the second quarter of the table. They remain at -5 during the third quarter of the table, then move linearly from -5. to +5. during the last 1/4 of the table.
gen4 -L1000 0 -2 1 \
.25 -2 1 \
.5 4 1 \
.8 4 1 \
1 2 > $SFDIR/pitchfunc

The floating point values in the file remain at -2. during the first 25 % of the table, then move along an inverse exponential curve (a = 1) to +4. half way through the table, remain at +4. through 80 % of the table, then move along an inverse exponential curve to 2. during the final 20 % of the table.

To avoid cluttering your soundfile directory with these temporary function files, include lines at the very end of a script file (after the OFFICE USE ONLY section) deleting these temporary files, as at the end of example file plainpv5:

rm $SFDIR/ampfunc

Note: If a function definition you create contains an error that makes it impossible for the gen routine to create this function, an error message will be displayed, and the program will ignore this function and use the default values for the parameter(s) where this non-existent function is intended to be used. However, this error message will scroll by quickly near the top of the voluminous diagnostic messages from the program, and it is easy to miss this error. To do a test run of your function definition(s), place an exit command immediately after your function definitions, which will terminate the program at this point, without running the PVC analysis and resynthesis:

gen1 -L1000 0 0 .2 0 3.4 -96 3.68 -96 > $SFDIR/rampdown
gen1 -L1000 0 -96 .2 -96 3.4 0 3.68 0 > $SFDIR/rampup
exit
Run the program. If you get an error message, check your function definition(s) for errors, make any necessary corrections, and run the program again. If you get no error messages, remove the exit line and run the program.

RETURN TO INDEX


SAMPLE OUTPUT FROM PLAINPV

Below is a sample of the output from plainpv.


plainpv -N1024 -M0 -w2 -D400 -I2 -a-0 -P2 -A0 -C0 -t-96 -b0 -e0 -H0 -m200 
-X0 -R2000 -L0 -l0 -W0 -T0 -f-1 -F-1 -_1 -=1 -p1 -i.25  /S1/t.snd /S1/cm.mix.snd 

/////////////////////////////////////////////////////////////////////
---------------------------------------------------------------------

============================== PLAINPV ==============================


---------------------------------------------------------------------

========================== INPUT SOUNDFILE ==========================


INPUT FILE: FILENAME  = /S1/t.snd
INPUT FILE: SAMPLE RATE = 44100
INPUT FILE: NUMBER OF CHANNELS = 2
INPUT FILE: DURATION = 2.770386
INPUT FILE: BEGIN TIME = 0.000000
INPUT FILE: END TIME = 2.770386
INPUT FILE FORMAT: 16-BIT INTEGER

========================== OUTPUT SOUNDFILE =========================


OUTPUT FILE: FILENAME  = /S1/cm.mix.snd
OUTPUT FILE: SAMPLE RATE = 44100
OUTPUT FILE: NUMBER OF CHANNELS = 2
OUTPUT FILE FORMAT: 16-BIT INTEGER
OUTPUT FILE: DURATION = 5.540771

======================== ANALYSIS PARAMETERS ========================


FFT SIZE = 1024
*
      FUNDAMENTAL ANALYSIS FREQUENCY = 43.066406
*
WINDOW SIZE = 2048
FRAMES/SECOND = 400
      DECIMATION SAMPLES (samples between analysis frames) = 110

======================= RESYNTHESIS PARAMETERS ======================


TIME EXPANSION/CONTRACTION FACTOR = 2
*
      INTERPOLATION SAMPLES (samples between resynthesis frames) = 220
*
OSCILLATOR RESYNTHESIS THRESHOLD (in dB) = -96.000000
*
GAIN (in dB) =    0.000
PITCH TRANSPOSITION (in semitones) =    2.000
FREQUENCY SHIFT (in Hz) =    0.000
*
ENVELOPE ATTACK TIME (in seconds) =    0.000
ENVELOPE RELEASE TIME (in seconds) =    0.000
*
SPECTRUM WARPSHAPE INDEX =    0.000
*
FREQUENCY WINDOW: LOW BOUNDARY = 0.000000
FREQUENCY WINDOW: HIGH BOUNDARY = 22050.000000
*
*............. LOW/HIGH SHELF EQ............*
LOW SHELF FREQUENCY =  200.000
.......... LOW SHELF DECIBELS =    0.000
HIGH SHELF FREQUENCY = 2000.000
.......... HIGH SHELF DECIBELS =    0.000
*...........................................*
*
=====================================================================
ANALYSIS: CHANNEL = 1
..............USING BLACKMAN WINDOW
.....USING OSCILLATOR BANK RESYNTHESIS

*********************************************************************
**  PEAK AMPLITUDE STATISTICS **
*********************************************************************
     TIME          PEAKAMP      DECIBELS    (LAST DECIBELS PEAK)
*********************************************************************
(  0.00 -  0.25)    0.0005       -66.295     -66.295
(  0.25 -  0.50)    0.2052       -13.754     -13.754
(  0.50 -  0.75)    0.3285        -9.668      -9.668
(  0.75 -  1.00)    0.3066       -10.269
(  1.00 -  1.25)    0.3176        -9.962
(  1.25 -  1.50)    0.2731       -11.275
(  1.50 -  1.75)    0.2655       -11.518
(  1.75 -  2.00)    0.2416       -12.337
(  2.00 -  2.25)    0.2930       -10.661
(  2.25 -  2.50)    0.2915       -10.707
(  2.50 -  2.75)    0.3067       -10.267
(  2.75 -  3.00)    0.4094        -7.757      -7.757
(  3.00 -  3.25)    0.3076       -10.241
(  3.25 -  3.50)    0.2841       -10.930
(  3.50 -  3.75)    0.2843       -10.924
(  3.75 -  4.00)    0.3241        -9.786
(  4.00 -  4.25)    0.3340        -9.524
(  4.25 -  4.50)    0.3612        -8.845
(  4.50 -  4.75)    0.3113       -10.136
(  4.75 -  5.00)    0.3094       -10.189
(  5.00 -  5.25)    0.3141       -10.058
(  5.25 -  5.50)    0.1142       -18.846

============= PEAK AMPLITUDE ========================================
CHANNEL       TIME          PEAKAMP    DECIBELS    (CLIPPED SAMPLES)
.....................................................................
1            2.898           0.4094      -7.757
*********************************************************************


=====================================================================
ANALYSIS: CHANNEL = 2
..............USING BLACKMAN WINDOW
*********************************************************************
**  PEAK AMPLITUDE STATISTICS **
*********************************************************************
     TIME          PEAKAMP      DECIBELS    (LAST DECIBELS PEAK)
*********************************************************************
(  0.00 -  0.25)    0.0004       -67.948     -67.948
(  0.25 -  0.50)    0.2301       -12.763     -12.763
(  0.50 -  0.75)    0.2477       -12.122     -12.122
(  0.75 -  1.00)    0.1969       -14.115
(  1.00 -  1.25)    0.2631       -11.599     -11.599
(  1.25 -  1.50)    0.2086       -13.613
(  1.50 -  1.75)    0.2559       -11.840
(  1.75 -  2.00)    0.2671       -11.465     -11.465
(  2.00 -  2.25)    0.2768       -11.157     -11.157
(  2.25 -  2.50)    0.1762       -15.082
(  2.50 -  2.75)    0.2113       -13.502
(  2.75 -  3.00)    0.2549       -11.872
(  3.00 -  3.25)    0.2673       -11.460
(  3.25 -  3.50)    0.2869       -10.847     -10.847
(  3.50 -  3.75)    0.2841       -10.931
(  3.75 -  4.00)    0.1991       -14.019
(  4.00 -  4.25)    0.2131       -13.427
(  4.25 -  4.50)    0.2540       -11.904
(  4.50 -  4.75)    0.2235       -13.014
(  4.75 -  5.00)    0.2407       -12.369
(  5.00 -  5.25)    0.2941       -10.629     -10.629
(  5.25 -  5.50)    0.1166       -18.667

============= PEAK AMPLITUDE ========================================
CHANNEL       TIME          PEAKAMP    DECIBELS    (CLIPPED SAMPLES)
.....................................................................
2            5.103           0.2941     -10.629
*********************************************************************


=====================================================================

                 PEAK AMPLITUDES: ALL CHANNELS
---------------------------------------------------------------------
CHANNEL       TIME          PEAKAMP    DECIBELS    (CLIPPED SAMPLES)
.....................................................................
1            2.898           0.4094      -7.757
2            5.103           0.2941     -10.629
=====================================================================

PLAINPV: RESYNTHESIS COMPLETED
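
Several of the figures in this printout can be related to one another by simple arithmetic. The Python sketch below reproduces them from the values shown above. The relationships are inferred from the numbers themselves rather than taken from the plainpv source, so treat them as a plausible reading; in particular, the truncation of 110.25 samples to 110 is an assumption.

```python
import math

# Values copied from the printout above.
sample_rate = 44100
fft_size = 1024
frames_per_second = 400
time_factor = 2
input_duration = 2.770386

# Spacing between analysis bins, reported as FUNDAMENTAL ANALYSIS FREQUENCY.
fundamental_hz = sample_rate / fft_size
# Samples between analysis frames (44100 / 400 = 110.25, shown as 110).
decimation = sample_rate // frames_per_second
# Samples between resynthesis frames: decimation times the time factor.
interpolation = decimation * time_factor
# Output duration: input duration times the time factor.
output_duration = input_duration * time_factor


def amp_to_db(amp):
    """Convert a linear peak amplitude (1.0 = full scale) to decibels."""
    return 20.0 * math.log10(amp)


print(round(fundamental_hz, 6), decimation, interpolation)
print(round(output_duration, 5), round(amp_to_db(0.4094), 3))
```

The decibel values in the peak tables will not always match a conversion of the printed PEAKAMP exactly, since the printed amplitudes are rounded to four decimal places.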