ABCp
 the ABC Parser
last update: 09 Dec 2004  



Remo Dentato '04

Using ABCp

Contents
    1. Introduction
       1.1 Objectives
       1.2 Library interfaces
    2. An Example
       2.1 Main
       2.2 The handler
       2.3 Step by step analysis
    3. States
    4. Tokens

1. Introduction

1.1 Objectives

  This document describes how to use the ABCp library. It refers to the current stage of the software as described in the proof of concept section (which I hope you have already read).

  The code may be downloaded from the download page. It is fully ANSI C but has only been tested under Win32.

1.2 Library interfaces

  Following the suggestions from the abcusers mailing list, I can foresee the existence of three interfaces for ABCp:

  1. An "Object Oriented" interface. Once the file is parsed, a set of objects is created and exposed to the host application to navigate through the musical elements.
    It is applicable when ABCp is used from OO languages (C++, ...) or OO scripting languages (Python, Ruby, Lua, ...)
     
  2. A "Functional API". Much like the previous one: when the file is parsed, an internal representation is created and a value is returned as a reference to it. This value is then passed to a set of functions that navigate through the representation.
    It is applicable with non-OO languages (C, ...) and non-OO scripting languages (Tcl, Lua, ...)
     
  3. A "C handlers" interface. This operates at a syntactic level. Every time an ABC element is found, a C function (the "handler") is called. This is the lowest level of interface upon which the other two will be built. It is also the only interface implemented in the proof of concept.

  This document describes an example of the "handlers interface". Much work still has to be put into defining a consistent and easy-to-use functional API, and much more into defining the OO classes.
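
  The handlers interface is a callback scheme: the scanner walks the input and calls a user-supplied function for every element it recognizes. The toy below illustrates only that control flow. The names toyScan, toyHandler and countBars are hypothetical and not part of ABCp, the "elements" are single characters, and stopping on a non-zero return value is a convention chosen for this toy only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: receives one input character at a time.
   In this toy, a non-zero return value asks the scanner to stop. */
typedef int (*toyHandler)(char element, void *userdata);

/* Hypothetical scanner: calls the handler once per character.
   Returns 0 when the whole text was scanned, otherwise the
   non-zero value returned by the handler that stopped the scan. */
int toyScan(const char *text, toyHandler handler, void *userdata)
{
  assert(text != NULL && handler != NULL);
  for (; *text; text++) {
    int rc = handler(*text, userdata);
    if (rc != 0) return rc;
  }
  return 0;
}

/* Example handler: counts the '|' characters seen so far. */
int countBars(char element, void *userdata)
{
  int *count = (int *)userdata;
  if (element == '|') (*count)++;
  return 0;
}
```

  Calling toyScan("CDEF|GABc|", countBars, &n) with an int n set to zero leaves the number of '|' characters in n. The application never loops over the input itself; it only reacts to what the scanner reports.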

2. An example

  In the rest of the document we will use as a running example the following request:

Create a program to extract an "index of incipit" from an ABC file.

  Since the objective is to show how to use ABCp, our example only extracts the first four bars of each song from the ABC file. If we had an abcRender() function to create a graphical representation of the tune, we could easily create a really fancy index! We could also make the number of bars a parameter, along with many other enhancements.

  Please refer to the file incipit.c in the downloadable archive for the full source code.

2.1 Main

  Here is the main() function:

				
  1 #include "abcp.h"
...
 49 int main(int argc,char *argv[])
 50 {
 51   if (argc < 2) {
 52     fprintf(stderr,"Usage: incipit file.abc\n");
 53     return 1;
 54   }
 55  
 56   cursor=buffer;  buffer[0]='\0';
 57   if (abcScanFile(argv[1],handleincipit)) {
 58     fprintf(stderr,"ERROR! cannot scan %s\n",argv[1]);
 59     return 1;
 60   }
 61   return 0;
 62 }

  As you may notice, the only file to include is "abcp.h". After that you can call the abcScanFile() function, which takes two parameters: the name of the file and a pointer to a function to be called each time a match is found.

  Note that a function abcScanString() is also available to scan a text buffer instead of a file. This was suggested on the abcusers list.

2.2 The handler

  This is the function that gets called each time a token is found:

  3 char buffer[1024];
  4 char *cursor=buffer;
  5
  6 int nbars=0;
  7 int incipitfound;
  8
  9 int handleincipit(abcHandle h)
 10 {
 11   Tokens     t = abcToken(h);
 12   char    *str = abcString(h);
 13   States     s = abcState(h);
 14
 15   if (t == T_FIELDB) {
 16     if (s == (S_FIELD | 'T')) {
 18       strcat(cursor,str);
 19     }
 20     else if (s == (S_FIELD | 'X')) {
 21       cursor=buffer;  buffer[0]='\0';
 22       sprintf(cursor, "%4d - ",atoi(str));
 23       nbars=0;
 25       incipitfound=0;
 26     }
 27   }
 28   else if (t == T_EMPTYLINE || t == T_EOF) {
 29     if (*buffer) printf("%s\n",buffer);
 30     cursor=buffer;  buffer[0]='\0';
 31   }
 32   else if (s == S_TUNE    &&  nbars < 5 &&
 33            t != T_ENDLINE && t != T_STARTLINE &&
 34            t != T_FIELD   && t != T_EXTFIELD ) {
 35       if (! incipitfound) {
 36         strcat(cursor,"\n       ");
 37         while (*cursor) cursor++;
 38       }
 39       else if (t == T_BAR)
 40         nbars++;
 41       strcat(cursor,str);
 42       incipitfound = 1;
 43   } 
 44   while (*cursor) cursor++;
 45   return 0;
 46 }			

2.3 Step by step analysis

  To understand how the handler works, you should have a look at the states and the tokens that the abcScanFile() function reports, described in sections 3 and 4.

  Let's start analyzing the code:

				
   9 int handleincipit(abcHandle h) 

The handler receives a single parameter: an opaque pointer whose only use is to be passed to the other ABCp functions.



				
 11   Tokens     t = abcToken(h);
 12   char    *str = abcString(h);
 13   States     s = abcState(h);

To get information from the handle we have the following functions:

Tokens abcToken(abcHandle h)
Returns the token found
 
const char *abcTokenName(Tokens n)
Returns a string with the name of the token
(T_FIELD -> "T_FIELD")
 
States abcState(abcHandle h)
Returns the current internal state of the parser. 
 
const char *abcStateName(States s)
Returns a string with the name of the state (S_LIMBO -> "S_LIMBO")
 
const char *abcString(abcHandle h)
Returns the string that matched.
  
const char *abcFilename(abcHandle h)
Returns the name of the file currently parsed.
 
int abcLine(abcHandle h)
int abcColumn(abcHandle h)
Return the line and column numbers where the match occurred.
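
The functions above can be combined into a handler that traces every match, which is handy when exploring how a file is tokenized. This is only a sketch. So that it compiles without the library, it begins with stand-in definitions that follow the signatures listed above; the enum values and the fakeMatch structure are invented, and with ABCp installed that whole section would be replaced by #include "abcp.h". The names describeMatch and tracehandler are hypothetical too, and snprintf() requires a C99 library.

```c
#include <assert.h>
#include <stdio.h>

/* ---- Stand-ins so the sketch compiles without the library. ----
   With ABCp, drop this section and #include "abcp.h" instead.
   Enum values and struct layout are invented for illustration. */
typedef enum { T_NOTE, T_BAR } Tokens;
typedef enum { S_LIMBO, S_TUNE } States;
typedef struct {
  Tokens token; States state;
  const char *string, *filename;
  int line, column;
} fakeMatch;
typedef fakeMatch *abcHandle;

Tokens      abcToken(abcHandle h)    { return h->token; }
States      abcState(abcHandle h)    { return h->state; }
const char *abcString(abcHandle h)   { return h->string; }
const char *abcFilename(abcHandle h) { return h->filename; }
int         abcLine(abcHandle h)     { return h->line; }
int         abcColumn(abcHandle h)   { return h->column; }
const char *abcTokenName(Tokens t)   { return t == T_BAR ? "T_BAR" : "T_NOTE"; }
const char *abcStateName(States s)   { return s == S_TUNE ? "S_TUNE" : "S_LIMBO"; }
/* ---- End of stand-ins. ---- */

/* Formats one match as: file:line:column TOKEN STATE 'string'
   Returns the number of characters snprintf() wanted to write. */
int describeMatch(abcHandle h, char *out, size_t size)
{
  assert(h != NULL && out != NULL && size > 0);
  return snprintf(out, size, "%s:%d:%d %s %s '%s'",
                  abcFilename(h), abcLine(h), abcColumn(h),
                  abcTokenName(abcToken(h)), abcStateName(abcState(h)),
                  abcString(h));
}

/* A handler with the same shape as handleincipit() that traces each match. */
int tracehandler(abcHandle h)
{
  char line[256];
  describeMatch(h, line, sizeof line);
  fprintf(stderr, "%s\n", line);
  return 0;
}
```

With the real library, tracehandler could be passed to abcScanFile() in place of handleincipit to see which tokens and states a given file produces.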
 



 15   if (t == T_FIELDB) {
 16     if (s == (S_FIELD | 'T')) {
 18       strcat(cursor,str);
 19     }
 20     else if (s == (S_FIELD | 'X')) {
 21       cursor=buffer;  buffer[0]='\0';
 22       sprintf(cursor, "%4d - ",atoi(str));
 23       nbars=0;
 25       incipitfound=0;
 26     }
 27   }

  At line 15, we check whether a "field body" has been found.
  If this happened in the state "S_FIELD_T", we have found the title. If we were in the state "S_FIELD_X", we have found the beginning of a song and we reset all the variables we are using.


  If we have found an empty line (T_EMPTYLINE) or we reached the end of the file (T_EOF), it's time to print the information we have accumulated:

 28   else if (t == T_EMPTYLINE || t == T_EOF) {
 29     if (*buffer) printf("%s\n",buffer);
 30     cursor=buffer;  buffer[0]='\0';
 31   }



 32   else if (s == S_TUNE    &&  nbars < 5 &&
 33            t != T_ENDLINE && t != T_STARTLINE &&
 34            t != T_FIELD   && t != T_EXTFIELD ) {
 35       if (! incipitfound) {
 36         strcat(cursor,"\n       ");
 37         while (*cursor) cursor++;
 38       }
 39       else if (t == T_BAR)
 40         nbars++;
 41       strcat(cursor,str);
 42       incipitfound = 1;
 43   } 

  If we are in the tune (S_TUNE), we collect the symbols we encounter and count the bars. Some tokens have to be ignored: the beginning of a line (T_STARTLINE), the end of a line (T_ENDLINE) and the field tokens (T_FIELD, T_EXTFIELD).
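
  The handler appends to a fixed 1024-byte buffer with strcat(), which does not check for overflow, so a very long title or a dense first line could in principle run past the end. One possible safeguard, not taken from incipit.c, is a bounded append helper such as the sketch below. The name appendBounded is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Appends src to the NUL-terminated string in dst without writing past
   dst[size-1]. The caller must pass an already terminated dst.
   Returns 1 if all of src fitted, 0 if it had to be truncated. */
int appendBounded(char *dst, size_t size, const char *src)
{
  size_t used, room, len;

  assert(dst != NULL && src != NULL && size > 0);
  used = strlen(dst);
  assert(used < size);
  room = size - 1 - used;            /* bytes free before the terminator */
  len  = strlen(src);
  if (len > room) {
    memcpy(dst + used, src, room);   /* keep only what fits */
    dst[used + room] = '\0';
    return 0;
  }
  memcpy(dst + used, src, len + 1);  /* copy including the terminator */
  return 1;
}
```

  In the handler, a call like strcat(cursor,str) could then become appendBounded(buffer, sizeof buffer, str), at the cost of rescanning the buffer from its start on each call.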

3. States

  At any time the parser is in one of the following states:

S_LIMBO Before any song or between songs
 
S_FIELD... Inside an information field. For example, when an X: field is found, the next state is set to "S_FIELD_X" (which in C has the numeric value S_FIELD | 'X').
This way, when the field body is found (T_FIELDB token), it's easy to tell which field it refers to.
 
S_INFIELD... Same as before, but for fields found inside the song, for example when a [K:Cdor] is found.
 
S_EXTFIELD Inside an "extended" field. Those are the ones starting with "%%".
 
S_TUNE Inside the real tune. Notes, rests, bars and such are recognized when the parser is in this state.
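
  Because the field letter is OR-ed into the state value, a handler can test for one specific field or recover the letter from the state. The sketch below shows the idea under the assumption that S_FIELD is a flag lying above the low 8 bits. The value 0x100 and the names DEMO_S_FIELD, demoFieldState, demoIsField and demoFieldLetter are invented here; the real constants are in abcp.h.

```c
#include <assert.h>

/* Assumed stand-in for S_FIELD: one bit above the 8 bits of a character. */
#define DEMO_S_FIELD 0x100

/* Builds the state for a given field letter, e.g. 'X' gives S_FIELD | 'X'. */
int demoFieldState(char letter)
{
  return DEMO_S_FIELD | (unsigned char)letter;
}

/* True when the state carries the field flag. */
int demoIsField(int state)
{
  return (state & DEMO_S_FIELD) != 0;
}

/* Recovers the field letter from the low 8 bits of the state. */
char demoFieldLetter(int state)
{
  assert(demoIsField(state));
  return (char)(state & 0xFF);
}
```

  Under that assumption, a handler could use a switch on the recovered letter instead of a chain of comparisons against S_FIELD | 'T', S_FIELD | 'X' and so on.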
  

4. Tokens

  Here is the complete list of tokens returned by the parser to the C handlers.
  The "returned string" or the "matching string" is the string returned by abcString(h).

T_UNKNOWN An unrecognized element (usually a single char is returned as matching string).
 
T_EOF The end of file. No string returned.
 
T_COMMENT
A comment. The entire comment line is returned as matching string.
 
T_EMPTYLINE
An empty line, i.e. a line with only spaces. The spaces (if any) are returned as matching string.  
 
T_EXTFIELD An extended field. The matched string is the name of the field (for example "MIDI" or "staves").
 
T_FIELD An information field (such as X: or A:). The returned string includes the colon (e.g. "X:" and "A:").
 
T_INFIELD A field inside the tune. Again, the returned string includes the colon (e.g. "K:" and "V:").
 
T_FIELDB The matched string is the "body" (i.e. the content) of the field previously found. The current state helps in knowing which field it was.
 
T_NOTE The matched string is a note with accidental, pitch and duration (e.g. "_E,,2").
 
T_WSPACE Just white spaces.
 
T_BAR
A bar. The matched string is the full group as in ":|]"
 
T_TEXT Some text (generally found in S_LIMBO)
 
T_ENDLINE The end of the line. The matched string contains any trailing space.
 
T_CONTINUE The line ends but the next line should be considered as the continuation of this line.
 
T_DECOR
A decoration. It may be something like "T" or "+trill+" or "!trill!".
 
T_REST A rest "z" (with duration)
T_INVREST An invisible rest "x" (with duration)
T_MULTIREST A multi-measure rest "Z" (with duration)
 
T_SPACER The spacer "y"
 
T_NPLET The beginning of an n-plet according to the 2.0 standard draft.
 
T_OPENSLUR T_CLOSESLUR Mark the start point and the end point of a slur.
 
T_DOTRIGHT T_DOTLEFT Broken rhythm. States if the left or the right note is to be dotted. The string contains as many ">" or "<" as the number of dots.
 
T_CHORDSTART
T_CHORDEND
The start and the endpoint of the chord. All the notes in between should be considered as part of the chord.
 
T_CHORD A "named" chord such as "Cmaj7"
 
T_REPEAT A number right after a bar is to be considered an indication of alternative repeats. Currently the parser does not check whether the number is correctly placed next to a bar.
 
T_GRACEAPP T_GRACEACC
T_GRACEEND
Delimit grace notes. T_GRACEAPP indicates an appoggiatura and T_GRACEACC an acciaccatura.
 
 
T_TIE A tie.
 
T_STRING A string (a text in double quotes that is not an annotation).
 
T_ANNOTATION
An annotation (includes the indication of where to put the text).
 
T_BREAK A line break sign ("!")
 
T_OVERLAY A "&" operator.
 
T_STARTLINE The beginning of the line.
 
T_PRAGMA A line starting with "#". Useful to handle directives like #if or #define as Guido Gonzato did in his abcpp preprocessor.
 
T_LYR_SYLL A syllable in a line of lyrics ("w:"). This includes the dash "-".
 
T_LYR_BLANK A blank that takes the place of a note ("*")
 
T_LYR_BAR A bar in a line of lyrics ("w:")
 
T_LYR_SPACE A group of white spaces in a line of lyrics ("w:").
T_LYR_CONT
A continuation of the previous syllable ("_")
 
T_LYR_VERSE The number of a verse.
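
  Since T_NOTE delivers the whole note as one string (e.g. "_E,,2"), a handler that needs the individual parts has to split it. The sketch below is one loose way to do so, assuming the usual ABC note layout: optional accidentals (^, _ or =), a pitch letter, octave marks (' or ,) and an optional length made of digits and slashes. The names demoNote and demoSplitNote are hypothetical, and the function is a splitter, not a validator.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical container for the parts of a note string. */
typedef struct {
  char accidental[3]; /* up to two of '^', '_', '='; "" when absent */
  char pitch;         /* 'A'..'G' or 'a'..'g' */
  int  octave;        /* +1 for each "'", -1 for each "," */
  char length[8];     /* e.g. "2", "/2", "3/2"; "" when omitted */
} demoNote;

/* Splits a string such as "_E,,2". Returns 1 on success, 0 if the
   string does not follow the assumed layout. */
int demoSplitNote(const char *s, demoNote *n)
{
  size_t i = 0;

  assert(s != NULL && n != NULL);
  memset(n, 0, sizeof *n);
  while (i < 2 && (*s == '^' || *s == '_' || *s == '='))
    n->accidental[i++] = *s++;
  if (*s == '\0' || strchr("ABCDEFGabcdefg", *s) == NULL) return 0;
  n->pitch = *s++;
  for (; *s == '\'' || *s == ','; s++)
    n->octave += (*s == '\'') ? 1 : -1;
  for (i = 0; isdigit((unsigned char)*s) || *s == '/'; s++) {
    if (i + 1 >= sizeof n->length) return 0;  /* length part too long */
    n->length[i++] = *s;
  }
  return *s == '\0';  /* anything left over means an unexpected character */
}
```

  For "_E,,2" this yields the accidental "_", the pitch 'E', an octave offset of -2 and the length "2".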