VISUAL PROGRAMMING WITH ICONS

BY
OLIVIER BERNARD CLARISSE

 
Submitted in partial fulfillment of the
requirements for the degree of
Doctorate of Philosophy
in Electrical and Computer Engineering
in the School of Advanced Studies of
Illinois Institute of Technology

Approved ___ S. K. Chang ___
Adviser


Chicago, Illinois
December, 1985

 


 

 

c Copyright by
Olivier Bernard Clarisse
1985

 


 

 
ACKNOWLEDGMENT

          The effort needed to complete this thesis is impossible without the support of many people.

          I wish to thank first Dean A. Vacroux, the Electrical Engineering Department and Dr. Shi Kuo Chang for supporting me, through teaching assistantship and research assistantship during this period of research at IIT.

          I want to thank Dr. Shi Kuo Chang for giving me full time access to one of the lisp machines at the Information Systems Laboratories, and for the constant challenge and support without which this dissertation would not be possible.

          I want to thank Dr. Howard Dreizen for his invaluable advice and for providing important lecture materials. I want to thank Dr. Peter Greene for sharing with us his ideas in artificial intelligence and lisp programming.

          I found no words, only icons, to express my thanks to my wife Diana for her constant support and encouragement.

					  O. C.

 


 

TABLE OF CONTENTS
 
Page ACKNOWLEDGMENT . . . . . . . . . . . . . . . . . . . . iii LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . vi ABSTRACT . . . . . . . . . . . . . . . . . . . . . . . x CHAPTER

I. INTRODUCTION . . . . . . . . . . . . . . . . . 1

Visual Interaction . . . . . . . . . . . . . 2 An Icon Points to an Icon . . . . . . . . . 4 The Use of Icons in Man-Machine Interface . 6 Database Navigation Using Icons . . . . . . 6 Visual Languages . . . . . . . . . . . . . . 7 Iconic Systems . . . . . . . . . . . . . . . 7 Icon Management . . . . . . . . . . . . . . 8 Contents . . . . . . . . . . . . . . . . . . 9

II. THE ICON MANAGER . . . . . . . . . . . . . . . 12

Icon Construction . . . . . . . . . . . . . 12 A General Definition of Icon . . . . . . . . 31 Icon Extraction . . . . . . . . . . . . . . 36 The Icon Interpreter . . . . . . . . . . . . 44 LISP Implementation . . . . . . . . . . . . 53 An Interactively Programmable Menu System . 63

III. THE ICON EDITOR . . . . . . . . . . . . . . . 66

Methodology for Icon Editing . . . . . . . . 66 Image Editor and Sketch Matching . . . . . . 69 Icon Structure Editor . . . . . . . . . . . 77 Graph Editor . . . . . . . . . . . . . . . . 87

IV. IMAGE PROCESSING TOOLS OF THE ICON MANAGER . . 89

Halftone Image Representation . . . . . . . 89 A Region Growing and Contour Following Algorithm . . . . . . . . . . . . . . . . 125 Conclusion and Future Research . . . . . . . 163

V. VISUAL PROGRAMMING IN THE ICON WORLD . . . . . 165

A Survey of Visual Programming Techniques . 167 Program Visualization . . . . . . . . . . . 170 Program Visualization Using Video Games . . 172 Visual Programming in The Icon World . . . . 173 Visual Functional Programming . . . . . . . 179 Application to Circuit Design . . . . . . . 192 Conclusion on Visual Icon Management . . . . 197

Page VI. ICON MATCHING AND INFORMATION RETRIEVAL . . . 199

Methodology for Icon Matching . . . . . . . 199 An Algorithm Using Exact Icon Sketch Matching . . . . . . . . . . . . . 201 Icon Matching Measures . . . . . . . . . . . 203 Approximate Sketch Matching: Hardware Limitations . . . . . . . . . . . 211

VII. MULTIMEDIA COMMUNICATION AND ICONS . . . . . 213

Icon Transmission on a Local Ethernet Link . 213 Bitmap Transfer . . . . . . . . . . . . . . 216 Experimental Results . . . . . . . . . . . . 218 Future Work: Application to Multimedia Messages and Teleconferencing . . . . . . 222

VIII. CONCLUSION . . . . . . . . . . . . . . . . . 225

Information Storage and Retrieval in an Icon System . . . . . . . . . . . . . . . 227 GRAPHLOG: Programming in Graph Logic . . . . 228 Quadtree Representation of Images and Pyramids of Microprocessors . . . . . . . 230 Pyramidal Multiprocessor Architecture . . . 234

BIBLIOGRAPHY . . . . . . . . . . . . . . . . . . . . . 239

 


 

 
LIST OF FIGURES
Figure Page

1. Design of a Sketch for the Icon MAPEXAMPLE . . 13 2. CONTOURFOLLOWER Applied to Lake2 of MAPEXAMPLE 15 3. REGIONGROWER Applied to the Object Lake2 . . . 16 4. Icon Lake Overlay Superimposed on MAPEXAMPLE . 17 5. Icon Forest Overlay on MAPEXAMPLE . . . . . . . 18 6. REGIONGROWER Applied to a CityQuarter Object . 19 7. The Icon CityQuarter Overlay Completed . . . . 20 8. Construction of the Icon Highway: Contour Follower on MAPEXAMPLE to extract H1 . . . . 21 9. REGIONGROWER Applied to H1 . . . . . . . . . . 22 10. REGIONGROWER Applied to H3 . . . . . . . . . . 23 11. Lake Overlay . . . . . . . . . . . . . . . . . 25 12. Forest Overlay . . . . . . . . . . . . . . . . 26 13. City Quarters Overlay . . . . . . . . . . . . . 27 14. Highway Overlay . . . . . . . . . . . . . . . . 28 15. Display of the Icon MAPEXAMPLE When Completed . 29 16. GRAPHICON Showing the Structure of MAPEXAMPLE . 30 17. READREGIONFROMSCREEN: TTY Interaction Window . 37 18. Reading Back Lake1 from a Display of MAPEXAMPLE 38 19. Icon Lake1Read Superimposed on a Sketch of MAPEXAMPLE . . . . . . . . . . . . . . . . . 39 20. Reading Back a Forest from MAPEXAMPLE . . . . . 40 21. Icon ForestRead2 on MAPEXAMPLE . . . . . . . . 41 22. Reconstruction of the Icon MAPEXAMPLE When READREGIONFROMSCREEN is Completed . . . 42

23. GRAPHICON Showing the Icon Structure Produced by READREGIONFROMSCREEN . . . . . . . . . . . 43

24. Commands Produced by MAKEICONCOMS and Used to Save the Icon MAPEXAMPLE with Icon Structure and Subicons on a File . . . . . . . . . . . 61 25. Saving Icons on Files: File Lengths and Derived Compresion Rates in the Case of MAPEXAMPLE . 62 26. ICONWORLD and ICONEDITOR Menu . . . . . . . . . 83 27. The Dither16Matrix . . . . . . . . . . . . . . 97 28. Following the Order of Dither16Matrix . . . . . 99 29. Display of Four Halftone Fonts (Size 16) . . . 106 30. SHOWARRAY applied to BMARRAY and the BITMAPTABLE . . . . . . . . . . . . . . . . 107 31. Using the Icon PAINTARRAY and the Interlisp-D Sketch Process . . . . . . . 108 32. Visualization of BMARRAY After the Changes . . 109 33. Original Image: Icon DI3 . . . . . . . . . . . 111 34. Halftone Encoding of DI3 with Font Size 2 . . . 112 35. Halftoning with Font Size 16 and Averaging . . 113 36. Halftoning with Font Size 8 and Pixel Skipping 114 37. Timing of the SHOWIMAGE Algorithm . . . . . . . 115 38. Halftoning with Font Size 4 and Pixel Skipping 116 39. Halftoning with Font Size 4 and Averaging . . . 117 40. Halftoning with Font Size 6 and Pixel Skipping 118 41. Halftoning with Font Size 6 and Averaging . . . 119 42. Halftoning with Shifting of Font (Size 6) to the Origin (0 0) and Averaging . . . . . . 120 43. Halftoning with Font Shifting to (2 2) . . . . 121 44. Halftoning with Font Shifting to (3 3) . . . . 122

45. Special Effect with the BITMAPIMAGE Algorithm: Horizontal Line Texture . . . . . . . . . . . 123

46. Special Effect with the BITMAPIMAGE Algorithm: Vertical Line Texture . . . . . . . . . . . . 124

47. Logical North West Gradient of Halftone Image . 126 48. Logical North East Gradient of Halftone Image . 127 49. Logical East Gradient of Halftone Image . . . . 128 50. Logical North Gradient of Halftone Image . . . 129 51. Logical Kirsh Operator (Order 1/2) applied to a Halftone Image . . . . . . . . . . . . . 130 52. Timing of the Four Gradient Operators . . . . . 131 53. The REGIONGROWER Algorithm with Adaptive Cell Division and Creation of a Transparent Icon . . . . . . . . . . . . . 146 54. Timing of REGIONGROWER and ICONMASKADAPTIVE algorithms . . . . . . . . . . . . . . . . . 147 55. Timing Graphs for Comparison of two Region Growing Methods . . . . . . . . . 148 56. Growing an Icon Mask for DI3: Level of Cell Division Equal to 3 . . . . . . . . . . . . . 149 57. Growing an Icon Mask for DI3: Level of Cell Division Equal to 4 . . . . . . . . . . . . . 150 58. Growing an Icon Mask for DI3: Level of Cell Division Equal to 6 . . . . . . . . . . . . . 151 59. Growing an Icon Mask for DI3: Level of Cell Division Equal to 7 . . . . . . . . . . . . . 152 60. Growing an Icon Mask for DI3: Level of Cell Division Equal to 8 (Bit Size) . . . . . . . 153 61. Superimposition of Icon Sketch and Icon Mask . 154 62. Changing the Background Texture of DI3 . . . . 155 63. MAKEICONTRANSPARENT? Applied to DI3: Icons With Text in the Background . . . . . . . . . 156 64. Timing of ICONMASKADAPTIVE Applied to a Large Bitmap (DI3) . . . . . . . 157

65. Timing Graph (Without Swapping Time) as a Function of The Level of Cell Division . . . 158 66. A Binary Tree Graph . . . . . . . . . . . . . . 178 67. Function Browser Applied to THINKJET . . . . . 182 68. Function Browser Applied to GRAPHICON . . . . . 183 69. Function Browser Applied to BITMAPMEASURE and a Graph of the Complete Lambda Expression 188 70. Tree Graph of the Lambda Expression of the Function BITMAPNIL? . . . . . . . . . . . 189 71. Placing Circuit Icons in the Circuit Editor Window . . . . . . . . . . . . . . . . . . . 194 72. Drawing Circuit Connections to Design CLK.FULL.ADD . . . . . . . . . . . . . . . . 195 73. Circuit CLK.FULL.ADD Completed and Used in a Simulation . . . . . . . . . . . . . . . 196 74. Timing Graph for Bitmap Transfer on Local Ethernet Using Method 1 . . . . . . . . . . . 219 75. Timing Graph for Bitmap Transfer on Local Ethernet Using Method 3 . . . . . . . . . . . 220

 


 

ABSTRACT

           

          This dissertation discribes the design approach of an iconic system on a modern lisp machine. The proposed system has applications in image information system, visual programming, computer aided design (CAD) and multimedia communications. Potential applications include computer vision systems, visual languages and iconic expert systems.

         

          A generalized definition of an icon is proposed: a visual representation of an object (physical or abstract) which has relational dependencies with other icons. An experimental iconic system has been designed around an Icon Manager and an Icon Editor systems. The Icon Manager includes facilities for icon creation, icon interpretation, icon exploration, icon saving (on files), and an interactively programmable menu system; it provides basic support to create icons and relate them to other icons. The Icon Editor supports several methods to interactively edit an icon from sketch representations or icon structure representations. The overall system allows the creation and organization of flat pictorial objects of any shape in a two and a half dimensional space.

         

          The image processing tools required by the iconic system include a general technique for halftone transformation of images, a general region growing technique, and methods for progressive transmission of images.

         

          Applications of this iconic programming environment to visual programming, to program design and electronic circuit design (from icon selection and editing), to knowledge systems based on icon graph matching and to multimedia communications are studied in detail. Finally, a possible hardware structure to support an icon management system is proposed which is based on a pyramid of microprocessors architecture.

         

 


 

 

CHAPTER I

INTRODUCTION

 
Following the evolution of electronic design technologies (lamp, transistor, integrated circuits, LSI and VLSI technologies), and the progress in computer softwares, four generations of computers have originated in the past thirty five years. The quest for the fifth generation of computers (computer system and software environment) is now the object of important research work having for goal the integration of sophisticated software tools on top of "non Von Neuman" computer architectures:

1. Integrating various Artificial Intelligence (AI) programming tools into a unified knowledge programming environment.

2. Integrating large arrays of microprocessors into well organized architectures (Dataflow Machines, Geometric Processors, Pyramids).

          A typical knowledge programming environment for the fifth generation computer systems will attempt the integration different programming concepts which have been elaborated from the first Von Neumann machine until the time of "non Von Neumann" networks of microprocessors. More precisely, a user friendly knowledge programming shell will probably house several of the following features:

1. Access to programming languages used in artificial intelligence such as Lisp and Prolog (running on specialized processors).

2. Access to traditional languages such as Ada, C language, Fortran, Pascal (each programming environment being accessible from the others).

3. Programmable adjustment of the computation power to satisfy specialized application needs (operating system with adaptive features to efficiently organize tasks among a large number of microprocessors).

4. Efficient manipulation of pictorial objects of two dimensions (eventually three) allowing extensive visual interaction, visual programming and program animation.

          A machine of the fifth generation is to be organized around a large central processing unit: an architecture grouping several specialized processors and eventually millions of microprocessors.

         

Visual Interaction

          As a contribution to the design and production of computer softwares for the fifth generation of computer systems, we believe that to provide the user with a programming environment which can be designed and modified by visual interaction is a prerequisite step in the process of efficiently transfering "artificial intelligence" to a computer machinery. The integration of visual programming capabilities in computer systems is expected to provide a more efficient programming environment featuring: better (and easier to learn) methods of interaction with the machine, and powerful (visual) tools for software creation, organization and debugging; resulting in more efficient computer software production.

          A possible approach to the creation of a visually programmable computer environment is the design of a computer system able to interpret flat pictorial objects (or three dimensional objects) displayed on a screen (respectively, as a holographic image). Where a typical interpretation of a visual object follows three processing stages:

1. Recursive decomposition of the object in sets of similar components (sub-objects) as function of their geometry, color, texture, relative positions.

2. Creation and storage of a logical model of the object and display of this model (visual).

3. Retrieval and display of one (all) of the internal models which is (are) similar to the model created.

          There exist numerous reasons which motivate the design and development of systems based on visual interaction with a machine. Some are briefly stated here:

1. The human vision system is an extremely powerful interface. It has the largest "bandwidth" of all our "sensors".

2. The human vision system is monitored by a physical model of the outside world which makes visual imagination the dedicated support of real life problem solving (e.g. planning by anticipation of physical actions).

3. Visual models are a possible international media for information exchange.

4. Many modern engineering concepts are best represented by schematics, diagrams, floor plans and layouts which cannot be described and exchanged efficiently in a spoken language "format".

5. Progress in computer graphics and computer architecture have made recent computer systems very suitable for multimedia (and visual) communication.

         

An Icon Points to an Icon

          A common approach has been recently adopted to design visually interactive systems: the use of icons. An icon is a visual symbol used to represent an "object"; physical or abstract (e.g. an action, an idea, a concept). The "object" has become a computer object: a command, an operator, a datatype, and other digital entities which are not physical without their visual representations on the screen. An icon has become a "pointer" to another non physical object: to another icon. Iconic systems, iconographies, rules to process visual symbols which point at other visual symbols are the object of current (and future) research works.

         

While exploring the use of icons and visual symbols, it is interesting to notice that paradoxes which have been illustrated in paintings and lithographs by well known artists of the twentieth century are now becoming part of the "machine". Consider two famous paintings by Magrite: "La Condition Humaine" where a painting is "painted" and placed in front of a window where it takes exactly the place of the part of the landscape it represents (the whole thing being in fact: a painting), "Les Deux Myste`res" where two pipes are represented on the same canvas one on a frame with the words "Ceci n'est pas une pipe" the other suspended in the air, which one is real? Which one is an icon? Other paradoxes have been expressed by Escher's lithographs and are wonderfully presented by Hofstadter (Hofstadter, D. R. 1980). Some represent visually infinite loops and recursions and can be interpreted as computer procedures, other lithographs by Escher have their analogs in images which are generated by (recursive) computer programs: they are surprising (iconic) representations of fractal objects (Mandelbrot, B. 1983).

          What is an icon? An icon points to an icon and can be implemented in computer systems.

          We briefly review in the following, the use of icons in various application domains (further material is provided with references and surveys within each chapter when needed). We then mention a more general definition of icon

which has been the starting point to the icon manager experimental environment described in the next chapters.

         

The Use of Icons in Man-Machine Interface

          An emerging trend in man-machine information system design is the use of icons or visual symbols in representing abstract objects. Since the introduction of icons in the context of computer systems several years ago, icons have appeared as a basis for communication with the object-oriented programming environments derived from Smalltalk (Goldberg, A. 1984), and are now used in simplified versions with many personal computers. The concept of icon has also been advanced by researchers in image processing (Chang, S. K. et al. 1983), computer graphics, and database interface (Herot, C. F. 1980), (Kramlich, D. 1984). Some originally proposed icons as part of the spatial data structure (Tanimoto, S. L. 1976).

         

Database Navigation Using Icons

          Icons have been used as navigational aids in large database systems: for example, in the Spatial Data Management System (SDMS) (Herot, C. F. 1980), (Kramlich, D. 1984), icons have been used to give database objects a pictorial representation. SDMS has demonstrated the efficacy of icon interaction to navigate inside very large pictorial databases (with spatial object dependencies).

         

Visual Languages

          Icon interaction has recently also been used as a new approach to define (graphical) programming languages (Glinert, E. P. et al. 1985), (Glinert, E. P. et al. 1984), (Jacob, R. J. K. 1985), (Brown, G. P. et al. 1985), (London, R. L. et al. 1985), (Reader, G. 1985). A survey of visual programming techniques and their relation to iconic system is presented in Chapter V.

         

Iconic Systems

          In image communications, the definition of image messages also consists of the dual forms (an object form and an image form), (Forsdick, H. C. et al. 1984). In multimedia communication, general structured objects are considered which may use iconic symbols to represent text, sound or images (Reynolds, J. L. et al. 1985). The duality between pictorial representation and logical information is also essential for representation of images. In an image information system, specialized software modules may be dealing with the physical image representation while other software modules are dealing with the logical object representation. Thus, in order for the software modules to deal with images in a uniform way, a dual representation in terms of logical-objects/physical-images is also necessary.

          The notion of a generalized icon was presented earlier (Chang, S. K. et al. 1983). It is an extension of the notion of object icon proposed by Tanimoto (Tanimoto, S. L. 1976). This generalized definition has been used in our first approach and experimentation with iconic systems (Chang, S. K. et al. 1984) and (Clarisse, O. B. et al. 1985).

         

Icon Management

          A new definition of icon which constitutes the basic "datatype" supporting the icon manager, is presented in Chapter II and is simply referred to using the term icon which is different from the usual "icon" of the computer environments presented above.

          In fact, it may seem that some similarities exist between the use of icons in our experimental iconic system and the definitions of object, class, metaclass in object-oriented programming languages such as Smalltalk. This analogy should not be pursued: the system proposed, the Icon Manager, is only an experimental "small" system and does not stand the comparison with Smalltalk.

          The icon manager is based on the creation, by visual interaction, of iconic objects with increasing relational complexity which represent the "knowledge" of the iconic system: it is a test bed for larger iconic systems. We expect iconic objects to represent real world objects as well as computer objects and their relations among each other; to the term object-oriented programming we will prefer the term relation-oriented programming or perhaps, structure-oriented programming to relate to the programming style available with the use of icons.

          We have noticed that with object-oriented programming and more recent approaches to visual (graphical) programming, the spatial dependency among objects was predefined by the system or left to the user with little support; in other applications (SDMS), the spatial organization was directly inherited from the type of object manipulated (geographical database). Therefore we would like to provide the user with various tools to organize icons interactively and to relate the spatial dependency of icons to his own defined rules.

          Contents

          This dissertation presents and describes several basic concepts to design icon management systems. Such iconic systems have direct applications in visual programming, image information system and multimedia communications. Future applications include computer vision systems, visual languages and iconic expert systems.

          The Icon Manager (Chapter II), uses a generalized definition of an icon: an object with a visual representation and relational structures to other icons. The Icon Manager includes facilities for icon creation, icon interprestation, icon exploration, icon saving (on files), and an interactively programmable menu system. It provides basic support to create objects and relate them to other objects.

          The Icon Editor (Chapter III) describes several methods to interactively edit an icon from sketch representations or icon structure graph representations. The overall system allows the creation and organization of flat pictorial objects of any shape in a two and a half dimensional space.

          The implementation of image processing tools required by the iconic system is described in Chapter IV including a general technique for halftone transformation of images, a general region growing technique, and future applications to image transmission.

          Applications of this iconic programming environment to visual programming and the design of visual languages (Chapter V); to knowledge systems based on icon matching (Chapter VI) and multimedia communications (Chapter VII) are also presented. Finally, a possible hardware architecture to support an icon management system is proposed for further research; based on a pyramid of microprocessors architecture (Chapter VIII).

          An implementation of an icon manager system has been written in Interlisp-D and runs on a Xerox 1108 personal workstation. It has been used to experiment with the possibilities of general iconic systems with respect to the visual interaction methods and visual programming methods presented in this dissertation. It is constituted of 150 pages of lisp code and includes the following: an icon manager and editor (file: ICONEDITOR), a set of functions defining the icon structure (file: ICONSTRUCTURE), an icon grapher (file: GRAPHICON), a simplified icon interpreter (file: ICONINTERPR), a recursive menu system (file: RECURSIVEMENU should be associated to a complete icon menu editor), several visual utility and utility functions (files: UTILVISUAL, UTILITIES), image processing utilities (files: NEWBITMAPIMAGE, ICONMASK, REGIONGROWER), a few basic icon transfer functions for an ethernet link (file: ICONTRANSFER), a file transfer protocol and "evalserver" for communication with a PDP11/70 computer system (file: IITNET) and a few demonstration functions were used to construct the examples presented in the text (file: GROWERDEMO ICONICMAPDEMO).

         

 


 

 

CHAPTER II

THE ICON MANAGER

 
Icon Construction

          Before giving our definition of the word "icon", we will survey a first example of icon design to illustrate several important requirements for an icon manager.

          Figure 1 has been created by mouse selection using the wonderful, but unfortunately programmed in "closed style" (not easily transformable for user applications), Sketch process available with the system library of Interlisp-D (Xerox Artificial Intelligence Systems, 1985). Sketch has been used to draw and display inside the window "Viewer onto MAPEXAMPLE" of Figure 1, a set of polylines and spline curves which represent objects of a sample geographical map system. Originally MAPEXAMPLE is only known by its sketch datatype (a regular s-expression) which is displayed in its sketch window by Sketch.

          The command (ICONICMAPDEMO 'MAPEXAMPLE) is executed and a complete icon is being designed with very limited help from the user. Icons of type "Lake" are first defined as follows: the user is prompted to give a name of a lake and to point at the lake contour. CONTOURFOLLOWER is then started and used to extract the object contour from the sketch display of MAPEXAMPLE, at the same time it finds the

 


 


Figure 1. Design of a Sketch for the Icon MAPEXAMPLE

minimum enclosing region of the object (the function MINIMUMENCLOSINGREGION advises the function MULTIGENE, a multidirectional "gene" used to grow the contour cells, see Chapter IV). REGIONGROWER is then started and generates the inside of the object region (the user has to point at the inside of the object with the mouse).

          Figure 2 illustrates the action of CONTOURFOLLOWER on Lake2, and Figure 3 the action of REGIONGROWER. When all the icons of type lake are created, a Lake Overlay can be generated which is superimposed on the sketch of MAPEXAMPLE (Figure 4). Each Lake is in fact a "sub-icon" of "Lakes" (or "Lakes of MAPEXAMPLE") which could be an instance of a more general icon: LAKE. All the lakes are created automatically by filling their interior with the lake texture (TXLAKE). Each lake in the present example is what we will call a "transparent icon": it looks as if it was defined by the interior of its contour, while all icons are in fact rectangular objects (windows).

          The same procedure is automatically applied for the construction of Forest icons and the Forest Overlay is presented in Figure 5. Figure 6 presents the effect of REGIONGROWER when extracting a CityQuarter icon. Figure 7 is the CityQuarter Overlay. Again, each icon in the overlay is a "transparent icon" independent of the other "sub-icons" and of the corresponding "Overlay" icon. Each icon is one connected component of an "Overlay", it can be moved, edited and processed independently if desired. Figures 8, 9 and 10


Figure 2. CONTOURFOLLOWER Applied to Lake2 of MAPEXAMPLE


Figure 3. REGIONGROWER Applied to the Object Lake2


Figure 4. Icon Lake Overlay Superimposed on MAPEXAMPLE


Figure 5. Icon Forest Overlay on MAPEXAMPLE


Figure 6. REGIONGROWER Applied to a CityQuarter Object


Figure 7. The Icon CityQuarter Overlay Completed


Figure 8. Construction of the Icon Highway: Contour Follower Applied on MAPEXAMPLE to extract H1


Figure 9. REGIONGROWER Applied to H1


Figure 10. REGIONGROWER Applied to H3

 


 

illustrate the same mechanism applied to "Highway" icons, it presents a limitation of the use of rectangular textures to fill in long and narrow regions.

          In Figures 11 to 14, the five possible "Overlay" icons are successively displayed independently of the MAPEXAMPLE sketch. The combination of the five overlays with the original MAPEXAMPLE sketch represent the icon MAPEXAMPLE completed (Figure 15). This new icon has several characteristics which distinguish it from traditional pictorial objects (and icons as usually defined). Although it looks like a page from a (black and white) coloring book, the icon MAPEXAMPLE is in fact constituted of sixteen independent pictorial objects which are related by a logical structure, and what is seen in Figure 15 is only one projection onto a flat image plane of the general object. Another projection of this icon is presented in Figure 16 and illustrates the underlying logical structure of MAPEXAMPLE (structure which is used to move and edit the icon from its sketch representation - Chapter III).


Figure 11. Lake Overlay


Figure 12. Forest Overlay


Figure 13. City Quarters Overlay


Figure 14. Highway Overlay


Figure 15. Display of the Icon MAPEXAMPLE When Completed


Figure 16. GRAPHICON Showing the Structure of MAPEXAMPLE

 


 

A General Definition of Icon

          In the context of a computer environment, an icon is a data structure with a non specified number of fields greater or equal to two; two fields are always defined: the "hiddenname" field, a unique reference (pointer) to the icon, and "iconselectfn", a function name or function definition which is evaluated when the icon is "opened" and selected by the user.

          Iconselectfn specifies the action performed when an icon (opened) is selected. If "hiddenname" is not defined, the default field "name" is used instead; "name" is the visible name of the icon: it is displayed when the icon "sketch" is displayed.

          The field "sketch" is usually a storing place for a bitmap image of the icon visual representation. If this field is omitted, "name" (or "hiddenname") is used and printed.

          An open icon is an icon which has been displayed (using "sketch", "name" or "hiddenname") on the graphics screen of the users workstation. It is characterized by the field "window" which holds a window datatype defining the "region" where the icon is located on the screen.

          The "region" field specifies the place of the icon on the screen in absolute coordinates and the icon size using four parameters: (x, y, width, height). When an icon is

opened, "region" specifies the rectangular region of the screen which can be used to select it (user interaction).

          Icon Aspects and Relations. Fields of the icon data structure can be divided in two categories: "aspects" and "relations". The "aspects" fields are used to specify the icon type; icon "aspects" include: icon "name" (any well formed literal can be used), icon "sketch" (any pictorial representation: bitmap, image, sketch boundary, etc.), "region", "window", "iconselectfn", etc.. The "relations" fields (or "relational fields") are of undefined length and hold lists of pointers to other icons. Each field specifies a relation type (e.g. children-of, in-front-of, behind, above, etc.), and a list of other icons involved in the relation named "relatives" of the icon.

          The name of a "relational field" should be the name of the relation itself or a function of this name; it can also be an icon so that a visual representation of the relation is possible (arc, wire, pipe, arrow, etc.). Relational fields are said to constitute the (logical) icon structure.

          Icons and Semantic Network. Each icon can be interpreted as a node in a Semantic Network. In other words, an icon is a node in a general graph where each arc represents a (named) relation to the other nodes (icons). Each icon is an object in itself: it can be displayed as a sketch or using its name printed in a given font type. Each

icon "knows" about all its relations with other icons and conversely.

          While each field is successively being defined and added by user (visual) interaction or a program procedure, the icon becomes a closer representation of a possible real object (physical or abstract) and "knowledge" about this object is "integrated" in the "icon system".

          Icon System, Icon Graph Representation. An icon system is a set of icons interconnected by pointers of their relational fields. The set of all icons defined in the computer environment is often called the "icon world".

          A graph (or subgraph) representation of an icon system is obtained by laying out the icon sketches (or icon names) as "node-labels", using icon names (or "hiddennames") as "node-ids" (identification numbers) and the relations (represented by names or sketches) as "arcs".

          An icon manager system may therefore be defined as a (visual) implementation of a computer environment based upon the graph theory.

          More on Icon Aspects. Other fields are now defined as possible elements of the icon data structure.

          The "sketch" field has associated to it a "sketchhistory" and an "iconmask": "sketchhistory" is a function name or expression used to reconstruct an icon sketch from any sketch type other than a bitmap (e.g. definition from a set of contours such as polylines and/or spline curves, definition from the processing and display of an image file).

          The field "iconmask" is used to create and display a "transparent icon" (defined below) as follows: iconmask is a black mask (bitmap) which defines the inside of an icon sketch (the background of a mask is white). Iconmask is used to give an icon any possible shape (other than the default rectangular shape) by performing too operations: first, transfering the iconmask bitmap at the place (region) of the icon on the screen (logical operation on bitmaps: (AND screenbitmap (INVERT iconmask))); second, transfering the icon sketch on top of the place (region) allocated by the iconmask (logical "or" operation on bitmaps: (OR screenbitmap sketch)).

          The field "mymenucommand" is used to provide the icon with a set of commands (available inside a menu) when it becomes a process.

          Basic Icon Types. Five basic icon types are used in the icon manager and are defined now:

1. Object icon: an icon which has a name or a sketch definition and the default function iconselectfn; when an object icon is selected, the icon interpreter simply evaluates iconselectfn.

2. Transparent icon: an object icon which has an iconmask.

3. Command icon: similar to an object icon except for iconselectfn which is now a function name or expression specific of the command

(it is executed to perform the command action directly on icons selected from the screen).

4. Process icon (or icon process): similar to an object icon but having a menu definition in the field "mymenucommand". A process icon can be used to apply commands on icons selected from the screen.

5. Variable icon: an icon characterized by a name of the form ICONVARS$$$$, where $$$$ is an instance value (integer) and a "sketch" field equal to T (True) which matches any other sketch. A variable icon can only have "relations" to other variable icons. This allows the definition of an abstract icon structure which matches any icon having the same structure (whatever its sketch content is).

          Note: in the rest of the dissertation we refer to the fields of an icon data structure defined above, using the same field names, but written in upper-case.

 


 

Icon Extraction.

          The inverse operation of icon construction may be called icon extraction: when the iconic system is given a bitmap representation of an icon (without any information on the icon logical structure) and attempts to recreate a plausible structure for the object for example by identifying and collecting regions which have similar textures or pictorial patterns.

          An example of icon extraction from MAPEXAMPLE is now presented. The function READREGIONFROMSCREEN is designed to recognize a region of the image plane (bitmap or screen) which has a homogeneous texture. It is used here to "read back" information from the display of Figure 15. Figure 17 presents a "non visual" part of the user interaction with READREGIONFROMSCREEN. In Figure 18, "Lake1" is read back; it is named "Lake1Read" by the user. Figure 19 present Lake1Read superimposed on the original sketch of MAPEXAMPLE: Lake1Read does not fill entirely the original lake with the lake texture because of a present limitation of the algorithm REGIONGROWER (see Chapter IV). Similarly, ForestRead2 is presented on Figures 20 and 21. Figure 22 presents a bitmap display of the icons extracted from Figure 15 (without using the logical information available) superimposed on the icon sketch of MAPEXAMPLE. Figure 23 is an example of icon structure which could be inferred from the information extracted by READREGIONFROMSCREEN and a simple model of geographical maps.


Figure 17. READREGIONFROMSCREEN: TTY Interaction Window


Figure 18. Reading Back Lake1 from a Display of MAPEXAMPLE


Figure 19. Icon Lake1Read Superimposed on a Sketch of MAPEXAMPLE


Figure 20. Reading Back a Forest from MAPEXAMPLE


Figure 21. Icon ForestRead2 on MAPEXAMPLE


Figure 22. Reconstruction of the Icon MAPEXAMPLE When READREGIONFROMSCREEN is Completed


Figure 23. GRAPHICON Showing the Icon Structure Produced by READREGIONFROMSCREEN

 


 

The Icon Interpreter

          The Icon Interpreter is designed to create system commands, and control the icon manager by selection of icon sketches using the mouse, so that almost no command need be typed from the keyboard to interact with the icon manager. However when it is needed, a TTY icon (using an interaction window) can be opened by a process icon.

          The rules defining the icon interpreter can be summarized by the following which only constitutes a guideline:

1. The left button of the mouse is used to attempt "evaluation" of an icon (accessing its function definition).

2. The middle button of the mouse is used to select icons as arguments in the context of a process icon (to move icons on the display screen, to place items inside the window of an interface icon, etc.). The middle button is also used to access the content of an icon or to "extend" an icon, for example, by displaying a menu of options which apply to it or by displaying successive menus in a recursive menu system.

3. The right button of the mouse is used to execute an icon command which originated from a left selection done earlier.

         

The terms "selection with the mouse", "mouse buttoning", "right buttoning", etc. are used in the text to indicate icon selection (and menu selection) using one of the three buttons available on the mouse of a Xerox 1108 workstation for example.

          The term "advising" is used several times in the dissertation and always signifies Lisp-Advising as described in (Xerox corporation, 1983): it is a way of modifying the behavior of a lisp function (usually a compiled definition) without actually modifying the function code. Advising permits interesting patching around pieces of code that the user is not supposed to understand. It is a very efficient way to make the lisp environment do what is desired; it has been largely used, but often results in non recoverable disasters!

         

Interpretation of Icons. What happens when an icon is selected using a mouse button? There are many different ways to describe the effect of icon selection: several parameters are involved in the behavior of an icon. We now give the main directions which we have been following to interpret icons.

          The default icon interpreter which also supports the simple icon editor used in the icon manager (file ICONEDITOR, Chapter III) is only sensitive to "mouse buttoning" with the middle button of the mouse (the middle button accesses the real content of an icon). An icon must be the "current icon process" in order for this type of selection to be effective.

         

Icon Interpretation Under an Icon Process. Currently, an icon process is an icon which has a function definition and a menu of commands displayed and attached to it. This definition can be changed so that a special property (PROCESS) will decide if an icon can effectively become an icon process.

          A default set of commands is available for all icons which do not have private menus of commands defined (special processes). The set of commands available with an icon process is brought up (as a menu) when the frame or the border of the icon are selected using the middle button, that is when the process becomes active. We will explain in a coming section how menus can be interactively constructed and attached to icons.

          Different behavior of icon processes can be obtained by simply replacing the ICONSELECTFN property (icon selection function); this has been used by the command EDITICON which allows an icon to be edited from its current representation by exchanging its ICONSELECTFN with a function able to modify the sketch datatype. Exiting EDITICON undoes the change on ICONSELECTFN.

          The default process is ICONWORLD, which is started whenever the file ICONWORLD (a set of basic icons) is loaded or by executing (ICONWORLD T). Under ICONWORLD a list of commands are made available (Chapter III, Figure 26), including commands to create and displace icons and commands for the icon editor. Each selection (middle button) under an icon process is considered as a new argument for the next command executed under the process. Each icon selected is saved in a list which is passed to the command (a command is executed from a menu by left buttoning preferably). This principle takes care of icon interpretation under an icon process.

         

Interpretation of Transparent Icons. The case of a transparent icon requires a special type of treatment which makes the icon world able to process flat objects of any shape in a consistent way while it is in fact supported by the original window facility of Interlisp-D (which only handles rectangular objects).

          Under the basic icon interpreter (file ICONEDITOR), the middle buttoning of a transparent icon is effective only on the body of the icon. This is absolutely necessary to be able to edit any set of transparent icons which are overlapping and insure that only the desired icon will be affected by the selection. Under the icon interpreter, two icons superimposed can be differentiated by mouse selection as soon as their icon sketches differ by a least one bit on the display screen. This is done within the function MIDDLEOFICON (one of the three functions which constitute the ICONSELECTFN in the context of the ICONEDITOR: LEFTOFICON, MIDDLEOFICON, and RIGHTOFICON).

          MIDDLEOFICON calls the function WHICHINSIDEICONMASK which returns a "which and where" information: "which" is the first icon found (by exploration from the top of the screen to the bottom) which has the mouse cursor inside its body; "where" indicates the abstract location of the cursor on the icon body: it is equal to INSIDE (when the cursor is inside the icon body) and to ONFRAME (when the cursor is on the icon title or the border). The informations "which" and "where" are then used by MIDDLEOFICON to perform icon interpretation as in the case of a regular (rectangular) icon.

          The icon body is the region of the screen where an icon is located, if an icon is transparent, the inside of its body (INSIDE) is limited to the iconmask and the body limit (ONFRAME) is the title region. A title has to be displayed in order for a transparent icon to be accessible as an icon process. WHICHINSIDEICONMASK is a function from the file UTILVISUAL, that explores the set of icons (windows) "covered" by or "hidden" under the topmost icon: the icon originally selected by the mouse action (its ICONSELECTFN has been activated). WHICHINSIDEICONMASK tests for each icon encountered if INSIDEICONMASK? succeeds. INSIDEICONMASK? simply tests if the relative cursor position inside the icon belongs to the icon internal region or to the iconmask bitmap.

          The method used to select and interpret transparent icons is very efficient: even with uncompiled code, the overhead required by WHICHINSIDEICONMASK does not seem to affect the speed of interaction between the user and the "icon world".

          Interaction with Transparent Icons: Text Formatting. Transparent icons could also be used to define TTY interaction icons (windows). This would probably require a modification of the functions operating on (bitmap) display streams (Interlisp) so that the rectangular clipping region test can be replaced by a test performed on the iconmask. A set of default set of iconmasks would be provided so that the user is able to choose between rectangular (default), round, star windows (icons).

          Basically, for a "TTY" icon of any (connected) shape (a transparent icon accepting TTY input/output), the display stream filling operation of inserts characters inside the icon bitmap (accessed from a text file or input from a keyboard) from left to right and automatically wrapping around to remain in the limits of the black pattern of the iconmask.

          This model of "text filling" can be used for general text formatting: several "transparent icons" can be placed in a sequence and the text filling operation is pursued sequentially on each one of them (opening the next one when the current one is full). This will allow horizontal text filling inside objects of any shape.

          Top Level Icon Interpreter. An example of top level icon interpreter can be found in the file (ICONINTERPR), which has been used to prove the equivalence between iconic programming and Lisp programming at the top level of Interlisp-D. An icon must to be created and given the function ICONINTERPRETER as icon selection function (ICONSELECTFN).

          Whenever the ICONINTERPRETER icon is selected, each icon selected thereafter with the left button is interpreted as the first atom of a (command) lisp expression which is constructed progressively entirely by mouse "buttoning". The arguments (icons) for a command are selected with the middle button. Commands can be embedded. The innermost (and last) command defined is executed first by "right buttoning" on the mouse. When all commands have been completed this way, the result from the Lisp evaluation is printed in a TTY icon (window). Of course each command icon can have other iconic side effects and printing the final result in a TTY icon may not be meaningful.

          ICONINTERPRETER is an "early written" function, it is extremely simplified and only occupies one page of lisp code. It should be considered as an elementary top level icon interpreter. Nevertheless, it is sufficient to construct commands from iconic selections.

          Visual representation of iconic commands is also mentioned in Chapter V, where improvements to the icon interpreter are proposed. For the moment we only add that the mechanism used by ICONINTERPRETER to give the user visual feedback on the iconic command he is constructing is the simplest possible: ICONINTERPRETER does not display a new picture (instance) of the icons selected to form the command. The first icon selected has simply its icon sketch "video inverted" and each new argument entered flashes on the screen when selected with the mouse while the command icon flashes also (as if it was "swallowing" the argument icons).

          Command Icons. A command icon is an icon which when activated by simple left buttoning of the mouse becomes video inverted on the screen and is ready to accept arguments selectioned by middle buttoning. It is finally executed when the right button of the mouse is pressed. Such an icon usually has the function COLLECTWINDOWSFOR somewhere inside its ICONSELECTFN definition. Command icons of one argument can be defined using the function COLLECTONEWINDOW instead.

          An example of command icon is the icon THINKJET which has been used to print out the figures presented in the dissertation. The effect of a command icon does not interfere with an eventual icon process so that printing (for example) is possible at any time when an icon is being edited or displaced on the screen.

          A Future "Intelligent" Icon Interpreter. We have imagined the use of three utility functions which will be useful in the future to "intelligently" control the behavior of the icon manager with respect to icon interpretation.

          The DOWHATIMEAN utility already existing in Interlisp-D will have an equivalent with pictorial objects: DOWHATIDRAW. DOWHATIMEAN performs interpretation of user commands according to available (user or system defined) functions so that many mistyped commands can be corrected automatically and executed. DOWHATIMEAN is also used by the lisp interpreter to modify spelling errors inside user program during their first interpretation.

          The DOWHATIDRAW utility is introduced in Chapter III and can be used to retrieve icons (arguments or function definitions) from the environment. It is a major component in a visual programming environment; it is somewhat comparable to reading visual information back from the display screen: an icon is retrieved by an approximate description of its sketch or structure. A complete DOWHATIDRAW utility should be able to interpret a user drawn sketch: the closest existing icon sketch is associated to the sketch drawn and this icon is displayed and/or executed.

          Finally the DRAWWHATIMEAN utility is another concept to guide the icon interpreter to understand and to "retrieve" a pictorial object (icon or command icon). It is specially important in the context of the image editor and the drawing facilities which should become available in all iconic (visual) programming environment.

          For example, DRAWWHATIMEAN could automatically connect by a spline curve, a set of points entered by the user who is making a drawing in aregion of the display screen; DRAWWHATIMEAN could also interpret the movement of a cursor (which represents a bitmap texture)

inside a region of a drawing as a painting operation and automatically filling the region with this texture (a call to REGIONGROWER, Chapter IV).

          In the context of image processing and image editing, many other operations could be initiated from a user a visual action (drawing, painting) rather than by the selection of a predefined icon sketch present on the screen. For example, painting around the contour of an object displayed on the screen can be interpreted as extracting the contour of this object, and the icon interpreter should go ahead and complete the action by applying a default contour extraction function on the image (CONTOURFOLLOWER).

          These three components of a generalized DOWHATIMEAN utility: DOWHATIMEAN (Interlisp-D), DOWHATIDRAW, DRAWWHATIMEAN can constitute the basis of an "intelligent" Icon Interpreter. They should be further studied and developed to be integrated in the icon manager.

          A conclusion may be derived from the last examples: the action of the icon interpreter should not be limited to selection, evaluation and display of icons (icon sketches). The icon interpreter must become sensitive to iconic movements, to icon dynamics as it is already sensitive to cursor movements.

         

LISP Implementation

          We have been implementing various operators to create, manipulate and organize icons. This work has been carried out on an Interlisp-D environment (Xerox 1108 Lisp Machine) (Teitelman, W. et al. 1981) and (Xerox Corporation, 1983) using specialized software packages (multi window package, graphic package, graphic text editor, sketch editor, bitmap editor, etc.) (Xerox Artificial Intelligence Systems, 1985) and an image processing environment under UNIX (DEC, PDP11/70) available from each lisp machine by file transfer protocol on a local Ethernet link. Time-server and eval-server (remote evaluation) protocols are also available from a Xerox 1108 to the UNIX environment. Two Xerox 1108 can also communicate on the same Ethernet link using XNS protocol.

          A set of rules defining the structure and the behavior of generalized icons in our implementation is now given. These rules take care of the definition of the four principal components of a generalized icon: the icon name, the icon sketch (a symbolic representation of its content - physical and logical), the icon structure (logical part, relations to other icons), and the icon content(s) (physical part).

         

1. Any litatom (a literal atom: not a number) in the lisp environment is an icon: it can be NIL and it does not have to be bounded. If the litatom is different from T and NIL, it can have a functional definition, a sketch definition and other (icon) properties described below.

2. An icon is a unique object with several different "aspects" which are stored under property names (SKETCH, SKETCHHISTORY, WINDOW, ICONMASK, CONTENT, MYMENUCOMMANDS, MYMENUWINDOW, etc.). These aspects are used by the icon interpreter to access icons.

3. The "aspects" of an icon can be invoked independently or in various "combinations".

4. An icon has an unlimited number of relations to other icons which define its general (logical) structure. These relations are stored under the property list of the icon as follows: the relation name is the property name (or a substring of the property name) and the property value is the list of icons related to the icon by the relation. They will be called "relatives".

5. Icon properties must only be accessed inside lisp expressions through a special set of functions (GETICONPROP, PUTICONPROP, ADDICONPROP, DELICONPROP) to insure the icon consistency. In the present implementation, some special properties are accessed by special functions such as GETICONSKETCH, SETICONSKETCH, GETICONMASK, SETICONMASK, ATOMWINDOWNAME (access the name of an icon from a window representation).

6. The properties (aspects) SKETCH and SKETCHHISTORY of an icon hold information to construct the icon sketch. When the icon is to be saved on file (or transfered into another environment), the property SKETCH holds the value of a bitmap or a command expression to reconstruct the icon sketch using the

property SKETCHHISTORY.

7. The property WINDOW is only defined when an icon has been opened (ready for display) once and holds a window datatype (Interlisp-D) which is used to display the icon on the screen.

8. Among the "relations" of an icon, two are reserved by the icon manager in the present implementation: PARENTLIST and CHILDLIST. They are special properties representing the default icon structure.

9. The PARENTLIST property of an icon is a list of icons to which the current icon belongs. An icon can be accessed by several of its parents at the same time, but cannot be internally modified by them. An icon is "free" when it has an empty PARENTLIST (it can only be deleted in that case). Icons having T (true) in their PARENTLIST cannot be deleted. This is not implemented any more.

10. The CHILDLIST property of an icon is a list of attribute pairs (ICONNAME, ICONPOSITION) where each icon listed is dependent on the current icon for many operations (MOVE, ATTACH, DELETE, etc.). The position of each icon listed is the relative position of its icon sketch with the current icon. It can be modified under the Icon Editor. This use of icon position has been lately replaced by accessing the region (property REGION) of each icon.

         

Remarks:

1. An icon sketch is usually a user made symbol, indicating what the object contains. If undefined the default could be to display the icon name as a title.

2. In the dissertation we refer to the structure of an icon in terms of logical part, icon relations, icon hierarchy or icon structure.

3. An icon defined by the ten rules above is usually referred to using the term of icon, but the terms object, class or subicon, depending on the context of the discussion may also be used. The term window may be used to refer to an opened icon in the present implementation (the WINDOW property of the icon).

          Operations analog to the Interlisp-D window operations have been redefined to operate consistently on the icon structure (OPENICON, MOVEICON, CLOSEICON, SAVEICON, etc...)

         

 


 

Examples of functions which have been defined to operate on icons are now given, followed by a short description of their use in the current icon manager environment:

 

(MAKEICON     'IconName - - -)   
    create icon IconName.
    (Prompt the user to define or select 
     an icon sketch (bitmap, window, etc.))

          (MAKEINSTANCE 'IconName) create a new instance of IconName.

          (MOVEICON 'IconName) move all or part of the icon IconName, if it is present on the display screen.

          (DELETEICON 'IconName) delete icon IconName Now called DESTROYICON.

          (TOPICON 'IconName) bring the icon sketch of IconName to the top of the display screen so that it can be directly invoked.

          (OPENICON 'IconName - - -) recursively opens all or part of the set of icons present in the structure of IconName.

          (GRAPHICON 'IconName) display all or part of the structure of IconName inside a window attached to the icon sketch.

          (SAVEICON 'IconName) save icon properties in a way which allows to store the icon on a file.

          (RETRIEVE - - - ) retrieve an icon from an incomplete description of it (not yet defined, the icon retriever is mentioned in Chapter III)

 


 

 

Note that "IconName" is the result produced by icon sketch selection; if no icon selection has been made when the command is executed (selected from a menu), a "prompt window" is opened where the icon name can be typed in by the user.

          A command to apply one of these functions to an icon can usually be constructed from selection of icon sketches, or it can be typed in as a regular S-expression inside the TTY-window.

          Saving Icons on Files. The Interlisp-D environment does not provide any support to save windows on files: a window is an internal datatype of the lisp environment and cannot be saved or exchanged. The only objects which can be easily saved are top level values of atoms, property lists (values of properties which are different from internal datatypes) and function definitions. BITMAP and ARRAY datatypes can also be saved in a VARS or UGLYVARS variable declaration.

          In the icon manager, windows are only the datatype used to display an icon on the screen. Saving the window is not necessary and the normal procedure to save an icon is to save all the properties on the property list of the icon (litatom) which have values of the type: ARRAY, ATOM (NUMBER AND LITATOM) , BITMAP or LIST.

         

The function SAVEICON is used to save properties from the icon window (value of the property WINDOW) onto the property list of the icon name. The function MAKEICONCOMS produces a complete list of commands to save an icon with its structure and visual representation on a file. An example is given on Figure 24: all the icons which are on the structure of MAPEXAMPLE will be saved on the same file. Each call to the (nlambda) function COMSFORICONFILE does a saving of the properties of the icon argument on a file: bitmaps are saved (such as sketch and iconmask), lists of icon relatives are saved (so that the icon structure is not lost), regions and numbers are saved and the function definition of an icon is saved on the same file if it does not already belong to another file. Arrays on the property list are not saved but this feature can be easily added.

          Figure 25 lists the directory and file length information of the floppy disk which has been used to save MAPEXAMPLE. The file MAPEXAMPLEICONS has been produced using the functions SAVEICON, MAKEICONCOMS and finally by executing the Interlisp-D command (MAKEFILE 'MAPEXAMPLEICONS). MAPEXAMPLEICONS has all the bitmap images necessary to reconstruct all the subicons of MAPEXAMPLE (Figure 16). For each subicon three bitmaps have been saved: SKETCH, ICONMASK, and CONTOUR. We estimate that only two of them are necessary; This is the reason why (2/3 MAPICONS) has been used to compute the second (inverse) compression rate of Figure 25.


Figure 24. Commands Produced by MAKEICONCOMS and Used to Save the Icon MAPEXAMPLE with Icon Structure and Subicons on a File


Figure 25. Saving Icons on Files: File Lengths and Derived Compresion Rates in the Case of MAPEXAMPLE

 


 

In this particular example, the region information of the subicons have been saved on a different file: MAPREGION (or MAPEXAMPLEREGIONS). The length of MAPREGIONS is then added to (2/3 MAPICONS) to estimate the length of the file necessary to save the complete structure of MAPEXAMPLE. That length is 25 times the length of the file MAPEXAMPLESKETCH which holds a sketch description of MAPEXAMPLE, plus the length of MAPEXAMPLETEXTURES (the five textures which have been used by ICONICMAPDEMO to construct the subicons). Similarly, the first (inverse) compression rate which is equal to 7 represent the information compression from a (flat) bitmap image of MAPEXAMPLE to the sketch representation of MAPEXAMPLE (plus the textures).

          In large geographical databases, icon sketch representation objects can be used similarly for information compression purposes. The logical properties available with the icon manager provide at the same time efficient exploration of the geographical database: database navigation and information retrieval.

          An Interactively Programmable Menu System

          Menus of commands can be interactively constructed using BUILDMENU: any function operating on icons can be added (or deleted) to a list of commands by simply selecting the icon sketch of each function in the order it should appear in the menu. Commands inside a menu may appear as the icon name or the icon sketch.

          Once a menu is formed, it can be saved (SAVEMENU) under the property MYMENUCOMMANDS of an icon and it represents all the possible actions which can be taken when this last icon becomes the current icon process.

          The menu of commands available within the icon editor (Chapter III), for example, can easily be defined using BUILDMENU. Facilities to edit menus (a MENUEDITOR) using mouse selection should be provided to complete BUILDMENU.

         

Recursive Menu System. Icons which are themselves menus can be inserted into other menus.

          This facility is already available to some extent in the Interlisp-D environment but is not accessible to the user. In the file RECURSIVEMENU a simple function is defined which interprets any S-expression as a set of embedded menus as follows: the first atom in a list is the menu title the other atoms are menu items, any list defines a new menu using the same convention. When an item is selected from such a menu using the left button, it is executed. When the selection is made with the middle button, a new menu is brought up if it exists, or the command is also executed.

          What RECURSIVEMENU makes available is an easy way to group commands into (higher level) sets of commands so that the user can use a general command most of the time and still access the exact command by further menu interaction when desired.

         

An Easy to Use (Visual) File System. The function RECURSIVEMENU can be used to give a menu representation of S-expressions and/or icon structures. A particular application is to represent a directory file system: each file or directory is an item in the menu, directories open themselves as sub-menus. Connecting the file system to the menu editor (not yet implemented) provides an easy to use visual interface with the file system to create, copy, move and delete files and directories. This application is left to the reader.

         

 


 

 

CHAPTER III

THE ICON EDITOR

 
Methodology for Icon Editing

          Since an Icon is principally constituted of a sketch, a physical part and a logical part (relational fields), it seems natural to operate on each part independently and then generate a higher level process operating consistently on each one. A similar approach should be used to access, retrieve, display and edit an icon.

          The Icon Editor is a good example to illustrate the possibility to affect, in parallel, several aspects of a same object (an icon) in one operation. Editing an icon sketch is one of the objectives of the Image Editor (which can also be used to edit the content of bitmap images). Editing the logical part (icon relations) is left to the Structure Editor. These two editors plus the structure-oriented Lisp Editor of Interlisp-D and the Text Editor should be merged into one to constitute a General Icon Editor.

         

Introducing the Icon Editor. The simplest Icon Editor is analogous to a line editor in the context of text editing, in the sense that commands are entered sequentially. It has been presented in the literature (Chang, S. K. et al. 1984).

          The command:

          (EDITICON 'ICONNAME)

          allows the user to reorganize the structure embedded under the icon name by using classical commands like move, drop, add, change, replace, to modify an icon tree structure by performing natural operations on the nodes. These operators define the minimum required for a Structure Editor. At the same time, new icon sketches should be generated by the user under (a basic version of) the Image Editor to reflect changes in the icon structure when necessary.

          Although they provided a clear demonstration of what has to be supported by an icon editor, the use of such operators would make rather impractical the transformation of icons.

         

Ideal Interaction with an Icon Editor: A Scenario. Editing one icon is performed by selecting the editor sketch (with the middle button of the mouse). The editor prompts the user to select another icon. A menu is then attached to the new icon sketch (a process has been created to allow icon editing).

          The menu attached by an editor process to an icon can be dependent on the "content" of the icon (or its type); it will bring up a list of all the aspects of the icon which can be edited.

         

In the usual case, the icon is a regular object of the Lisp environment, and the menu displayed (a list of all the modifiable aspects of this object) should be part of: VARS, FILES, FNS, MACRO, RECORDS, PROPERTIES (respectively for variables, files, functions, macros, records and properties). Selecting one of the menu items invokes the Lisp Editor to modify the corresponding lisp expressions; this is partly implemented in the current Interlisp-D environment, but not accessible to the user for changes.

          To access icon contents in a consistent way, some other object types should be added: BITMAP, IMAGEFILE, STRUCTURE, TEXT. Other types should be defined by the user with specifications on the editing method, but there is no clear way of doing this without rewriting the Interlisp structure editor.

          It is possible to define a menu system permitting to edit an icon as, a bitmap, an image, a text, a tree structure, a graph, a form, without having to type a specific lisp command each time. Each time one of these aspects is edited, a particular "branch" of the Icon Editor is selected. An icon can always be edited as a sketch (by invoking the image editor) or as a structure (by invoking the structure editor).

         

A Word on the Actual Implementation. The general icon editor mentioned above is not fully defined in the current system; instead, a set of commands have been written which allows manipulations on bitmaps (icon sketches) and another set of commands allows manipulation of the default icon relations used by the icon manager (CHILDLIST and PARENTLIST). They are described below. These commands have been sufficient for our experimentation and can be easily extended, if necessary, as will be mentioned later. The interactive menu system (Chapter II) allows the definition of menus where some of the commands just mentioned can be gathered by the user (himself) to define his own image and structure editor.

         

Image editor and Sketch Matching

          The image editor is specialized for editing images and sketches. It operates on each picture-related object according to its type: BITMAP, IMAGE (Chapter IV) or SKETCH; but its effect on an icon can be saved to some extent in the SKETCH and SKETCHISTORY properties. A bitmap can be used as a sketch (SKETCH) property or can be the physical content of an icon itself. The SKETCH property can be also a procedure which uses the SKETCHHISTORY information to reconstruct the icon sketch. This is used in MAPEXAMPLE (Figure 1) where a sketch datatype is saved under SKETCHHISTORY and a command invoking SKETCHW.CREATE (Interlisp-D Sketch package) is stored under SKETCH so that Figure 1 is automatically reconstructed when the icon is opened.

          A bitmap in itself is the lowest level of information available from a display system: anything can be transformed into a bitmap, a text, a lisp expression, a structured image.

         

The inverse operation which is the interpretation of a bitmap is highly dependent on knowledge and would require in the future more sophisticated programming tools to be pursued correctly by a computer system.

          When the user watches a bitmap displayed in the iconic environment, he can imagine what it means, what it represents and remember how it was produced, and therefore the user generally knows how to act upon it. However, to allow a similar "backtracking" operation among different knowledge representations in a computer environment, a complete history of the object should be saved to allow a knowledge system to perform this type of object understanding later.

         

The Image Editor. The image editor can be constructed by forming a menu of functions operating on bitmaps, windows, image files and sketch descriptions.

          It can be organized at users' will using functions defined in Chapter IV, and the BITBLT function of Interlisp-D which insures bitmap transfer and logical operations between bitmaps, windows (icons) and display streams. It can also include:

1. The bitmap editor EDITBM which allows the user to modify bit by bit a bitmap inside a window display by adding black and white dots into it.

2. The EDITBITMAP package which is a set of functions to shift bitmaps, invert them, perform some symmetries

along coordinate axes and diagonals or rotate them (90 degrees).

3. The Sketch package (available in the Intermezzo Release of Interlisp-D) which allows the user to draw a sketch using rectangles, circles, ellipses, polylines, spline curves and insert text as well as bitmaps inside it. Sketch has also a built in set of zoom functions.

4. Finally, a set of "cut and paste" type of operators which allow transfer of regions from bitmap (or window) among several pictorial objects. Cut and Paste could be easily extended to sketch objects (3 above) if the lisp code of the sketch package was available.

          The typical "cut and paste" operators of an image editor menu allow the user to MOVE a portion of a window bitmap inside itself (or to DELETE it) and to COPY the content of a window (ALL) or of a region clipped into this window (REGION) to another window. The operation performed between the two bitmaps (SOURCE and DESTINATION) should be selected from the submenu OR, AND, XOR, INVERT (-> INVERT SOURCE, -> INVERT DESTINATION) which relates to the logical operation applied to bits of each bitmaps. Presented in the literature is an example of operations performed by such Image Editor (Clarisse, O. B. et al. 1985).

         

Many functions useful for this type of image (bitmap) editing have been implemented (usually using the Interlisp-D BITBLT function) and can be found in the following files: EDITICON (functions: EDIMAGEREGION, PAINTWITHWINDOWS, etc.), UTILVISUAL, NEWBITMAPIMAGE, REGIONGROWER, ICONMASK and GROWERDEMO.

         

Extension to Transparent Icons. Note that a region (Interlisp-D REGION) is defined to be a rectangular portion of the display screen; this definition could be extended to include combinations of rectangular regions (using "attached" regions and general region decomposition as performed by REGIONGROWER (Chapter IV)) so that a general region can have any shape. For the moment, the use of TRANSPARENT icons has solved the problem of defining objects delimited by non rectangular regions (Chapter II). The image editor operators have yet to be extended to the case of transparent icons where each operation has a double effect: on the icon sketch and the iconmask. Example of such operators can be found in the functions ICONICMAPDEMO and READREGIONFROMSCREEN.

          When unified with the editing of icon structures, the image editor will allow hierarchically organized structures of images, sketches and text to be manipulated in a very elaborate way. This complete set of icon oriented processing tools should find applications in several domains such as photocomposition, text composition, design of customized image processing softwares and the design of sketch combination rules for iconic languages.

         

Icon Retrieval From its Sketch: A Scenario. We will refer to an an icon sketch using the term "sketch" (SKETCH) as well as the term "bitmap" (BITMAP). An icon is said to be "opened" if a window representing its icon sketch has been create (property WINDOW).

          We suppose that the user has already created many icons suitable for his applications. He is now confronted with the problem of retrieving some icons that he created a few days or weeks earlier and which have been closed. We are only concerned with icons for which the user has defined an icon sketch which is representative of the icon content. LISP functions may even have been written to extract icon sketches from pictorial objects in a consistent way. Hence, consistent icon construction is assumed and is left to the user.

          The user has three basic approaches to retrieve an icon defined inside the system:

1. Retrieval from a Name. He can type a command like: (OPENICON 'ApproximateName) and the "Do What I Mean" (DWIM: Interlisp-D) program will attempt to substitute the name of defined atom (ATOM) of the lisp environment which matches with "ApproximateName". This will only work if he has a close enough guess of the icon name and it will usually fail unless DWIM is modified to return a list of possible argument names.

2. Retrieval from a Structure. He can attempt a structure matching of the icon if he remembers that the icon has a particular organization. This approach is further explored in Chapter VI.

3. Retrieval from a Sketch. The user may remember what the icon "looks like" although he completely forgot its name and its structural aspect. He may also have in front of him the icon "KnownIcon" which looks like the icon he is looking for. A command like: (MATCHICONSKETCH 'KnownIcon NIL) which will perform a matching of the sketch of "KnownIcon" with all the sketch bitmaps available in the lisp environment (using the Interlisp function MAPATOMS) can be used to perform icon retrieval from a sketch. It could be the basic function for building a "Do What I Draw" utility (DOWHATIDRAW, Chapter II).

          Although in our experimentation with icon matching (Chapter VI) and icon retrieval we did not test the method 3 above, it can be implemented using a call to the MAPATOMS function while testing the similarity measure of the icon sketch with the icon sketch of each atom of the lisp environment when is exists. Therefore, an icon can really be referred to using only its sketch. The similarity measure can be done using the function MATCHBITMAP defined below.

         

The following rules could be used when defining new icons in an iconic system and will permit more efficient information retrieval than using a call to MAPATOMS with MATCHBITMAP testing (which could take several hours of computation in a large iconic system). The three following rules will make an icon system more similar to an object-oriented environment and therefore, such icon system should probably be implemented directly in Smalltalk.

1. Objects belonging to a same class are linked (at the top level of their parent/children hierarchy) to the same icon (having the name of the class) and have their icon sketches logically constructed using the icon sketch of the class. (In the context of office automation, one may think of a class of text files, a class of image files, a diary, a class of books, folders, papers, letters, a mailing list, a class of printers, etc. ).

2. Two different icons should not have the same sketch unless they are instances of a same icon: they will be differentiated by their internal icon name (the "hidden title" (HIDDENTITLE) or index of the window displaying their icon sketch). Note that if an icon name has already been used, MAKEICON will prevent you from using it again.

3. Efforts are made to give to each new object or object class a sketch representation characteristic of its information content.

         

The efficiency of an algorithm to retrieve icons using their sketch bitmaps is highly dependent on this last rule.

         

To perform the matching of two bitmaps, we defined the function MATCHBITMAP as follows: (BITMAPMEASURE (XORBITMAP BITMAP1 BITMAP2)). Where XORBITMAP (File: GROWERDEMO) performs the logical "xor" operation between the two bitmaps and returns a bitmap as result. To compute a measure of this new bitmap, black pixels are counted by accessing directly its content as bytes in the function BITMAPCOUNT. BITMAPMEASURE then returns the ratio of the bitmap area over the result from BITMAPCOUNT. Sometimes it is only necessary to test if the bitmap is all "white", the function BITMAPNIL? should then be used in place of BITMAPMEASURE. The functions BITMAPCOMP/BIT and BITMAPCOMP/SIZE which allow measure of the similarity of two bitmaps or the comparison of two bitmaps in size have also been defined. They are called for example by UPDATEICONSKETCH to detect if any changes have been done by the user on the icon sketch (BITMAP) and ask him if the new version should be saved or not.

          The functions BITMAPIMAGE, BITMAPMEASURE, BITMAPCOUNT (and TRANSFERICON from Chapter VII) are time consuming operations because bitmaps are accessed or created sequentially any other functions which is implemented using BITBLT accesses a bitmap much more efficiently (e.g. XORBITMAP).

         

Icon Structure Editor

          Structure Modifications from Sketches. We now describe a set of commands designed to perform structure modifications from a presentation of icon sketches.

          To modify the structure of an icon so that icon selection and displacement reflect changes inside the CHILDLIST and PARENTLIST properties icons, the window package of Interlisp-D is insufficient; it only allows the user to bring a window to the top (TOTOPW) of the screen or to bury (BURYW) it under all the other windows. Although it is theoretically possible to organize a "pile" of windows in any order using these two operators (TOTOPW and BURYW) by doing manipulations as if they were in a memory stack, it is rather inconvenient to use. Note that the Intermezzo version of Interlisp-D also allows the manipulation of "attached windows" which is a way to "glue" together sets of windows (basically when a window is attached to another, all the usual operators on windows are applied to both).

         

The following defines a set of operators which should be available in the Icon Editor menu to edit the structure of icons from icon sketch selection and displacement:  

          (AFTER ATTACH BEFORE CANCEL DELETE DETACH REPLACE SWAP SWITCH UNDO )

          (COPY EDIT EVAL MOVE )

 
These operators are consistent with the commands of the structure oriented Lisp Editor in Interlisp-D; when lisp expressions are interpreted as tree graphs (Chapter V).

          The first group of operators uses two arguments or more, the second group is constituted of unary operators.

          For each operator of the first group, each icon selected is "pushed on a stack" (a list), the first icon selected becomes the "target" and the others are "objects". When an icon sketch is first selected, it can be easily surrounded by a dark border on the display screen (by changing the BORDER property of the window which is used for its display). When more icon sketches are selected, the first selection becomes a "target" (it is the center point for the changes) its iconsketch can be surrounded by a larger dark border. If a selection is not right, the command CANCEL can be used from mouse selection.

         

Each binary (n-ary) operator can be interpreted in terms of "movements" of a set of "objects" (in the graph representation of the icon) with respect to the "target". Note that each time a target is moved, the whole set of icon sketches linked to it by the CHILDLIST relation (respectively PARENTLIST) follows the same "movement" as this target or object.

         

AFTER (BEFORE) place the objects after (before) the target in the branch of a directed graph representing the structure of the whole icon edited where the target is. AFTER (BEFORE) move the icon sketches of the objects underneath the icon sketch of the target without changing their spatial positions in the display plane, but updating their parent/children relationship.

          ATTACH (DETACH) provide some tools to glue (unglue) objects around a target. It will prompt you to give more information using: "where to glue what?" which can be answered using the displacement of the mouse. The result is a composite icon-sketch (in the same plane, on the display screen) which implies some icon-sketch position dependencies: it translates in calls to the ATTACH-WINDOW functions and is stored under the property ATTACHEDTO of icons.

          REPLACE replace a target by a set of objects, creating new parent/children dependencies. Icon-sketches are all moved to the plane of the target, but their spatial positions are unchanged.

          DELETE delete the objects, not the target (to delete one object at a time, select it once if no other selection has been made since last CANCEL, or select it twice if other objects are already selected).

          SWAP exchange objects and target, produces many targets and one object; if any operator is following, it will be applied successively to each pair of the form (target, object).

          SWITCH should be used as a binary operator: it will attempt to physically interchange two icons as if they were hypernodes in a directed graph.

          UNDO will undo part (or all) of what has been done, like the structure oriented editor of Interlisp-D.

         

The unary operators (second group) should only be applied on the last "object" selected.

          COPY copy the selected icon (represented by a set of icon sketches) onto a buffer window.

          EDIT create a new editor process applied to the icon selected.

          EVAL attempt the evaluation of icon selected (useful for debugging inside an editor session or to explicitly invoke a process). Note that EVAL could have a broader meaning than in the usual lisp environment: for example, evaluating an IMAGE icon could generate a bitmap image of its content, evaluating a TEXT icon could open a window (display stream) where the content of the file can be examined (scrolled up or down).

          MOVE move icon sketches inside the display screen; it allows the user to modify the positions of any icon sketch belonging to (i.e. element of the structure of) the icon selected (see, editing icon sketch positions).

         

Apply to Icon Structure: APPLICON. Most of the above operators can be programmed using calls to the general function APPLICON: "apply recursively to icon". APPLICON applies recursively (depth first) an operator (e.g. RELMOVEW, TOPW, CLOSEW, etc.) along the icon structure. It explores the default property CHILDLIST of either the window datatype associated to the icon (property WINDOW) or the icon itself. When editing an icon structure, only the window properties are modified, these changes are reported on the icon properties each time a SAVEICON is performed (Chapter II). APPLICON is not limited to the exploration of (icon) tree structure: it can explore any graph defined in terms of CHILDLIST relations, in particular, recursive graphs are explored so that each node is parsed once and only once. Being able to explore graphs with recursive links (loops) is particularly important for visual functional programming (Chapter V).

          For some operators, the depth first exploration of the icon structure is not desired (e.g. MOVEICON) so breadth first exploration is used. APPLICON can be extended to explore other icon relations as well and the code for a better version of APPLICON can be found in the file GRAPHSURGERY (icon grapher). APPLICON can also be extended to explore in parallel several icon structure (assumed to be similar): this is useful for icon matching and retrieval (see Chapter VI).

          The following operators are currently implemented: ADDCHILDREN, DETTACHICONS, MAKEINSTANCE, DESTROYICON, MOVEICON, SAVEICON which are somewhat equivalent to AFTER, DETACH, COPY, DELETE, MOVE, EXIT and can be seen on the iconeditor menu of Figure 26. They do not highlight the icon sketches selected as in the above description, but this is easy to add if necessary.

         

 


 


Figure 26. ICONWORLD and ICONEDITOR Menu

Icon Editing From Projections. Icon editing from projection is presented here as another example of the many other possible ways (yet to be explored) of editing icon structures available with our definition of icon. This technique uses the command MOVE (MOVEICON) defined earlier which is applied on selected icons. In this particular case, the icons presented are projections (on a given plane) of the original icons. This will allow editing of part of the icon structure which cannot be "seen" from the usual view point of the icon editor. This part of the editor has not been tested; it is interesting to implement for specific applications.

          For this approach, an icon is accessed as a whole object by scanning down its structure going from the top (itself) to the roots (CHILDLIST) and displaying successively (one level on top of another on the display screen) all icon sketches of all nodes encountered. The result is a representation of the icon as a series of layers of icon-sketches piled up. Note that icons are accessed very efficiently, since only the icon sketches are displayed and displaced (the real data objects do not have to be accessed from files or retrieved from databases). One inconvenient of this approach is that the display screen can soon become overloaded, and limiting the scope of access of an icon is recommanded. This should be done by specifying the value of SCOPE (within the function APPLICON) which will limit the depth accessed by the editor inside the structure (not implemented).

          Limitation of the exploration of an icon structure can also be provided by matching the icon with a "variable icon" (defined in Chapter II, Icon Types).

          Any operation in the icon editor at this point is understood as window manipulation which modifies the respective position of icons (represented by their sketches) inside the original icon: if you click the left button of the mouse inside an icon sketch, you attach it to the cursor and then by moving the mouse you drag it (with all the icon sketches of lower level attached to it). This has the effect of modifying the actual position attributes (in the list "CHILDLIST" of the icon currently moved).

          Therefore any movement of icon sketches inside the editor window affects directly the structure of each icon concerned.

          Note that many icon sketches are not accessible because they are hidden under other layers of icon sketches. Modifying the spatial position of icon sketches is mostly efficient for the last layer displayed (limited by SCOPE).

         

Editing Between Layers of Icon Sketches. In the process of editing icons as nodes of general graphs (objects of high relational dimension), an intermediate step is to provide the icon editor with some operators to generate planar projections of the icon structure inside a window and to allow modification of objects in that window. This is exactly what is done by the MOVE operator of the Icon Editor presented above: a series of projections of the icon have been previously created level after level (from top to bottom) which are the default representation of the icon structure in the icon manager, and MOVE operate on them. This view point is similar to looking at an organized pyramidal structure from "underneath": each block of the pyramid is itself an icon sketch (a visual projection of the icon as a bitmap).

          Exploiting the idea of a pyramid a little more, we may want to look at it from different angles (view points) and generate projections from the sides which would be relevant of the hierarchy of the icon. If the pyramid is too "thick" in one of the directions of projection, we may want to slice it and only project the slices. The "blocks" building the pyramid are projections of each icon into a bitmap, they may be the real iconsketch bitmaps or simply the icon name printed horizontally or vertically as a title (to save space). Then, the icon can be edited between different layers (levels) of its structure by moving the "blocks" (exactly like we edited icon sketch positions) inside the window where the projection has been produced, we modify the hierarchical structure. In fact, movements of icon sketches are translated into modification of the CHILDREN and PARENTLIST property lists of neighboring icons. (The ATTACHEDTO property for the subcommand ATTACH (DETACH) may also be updated). Moving one block detach it from the others and new links are derived later from the new block organization.

         

Graph Editor

          The Graph Editor introduced now is another way to modify the structure of an icon.

          Like many other graphic tools mentioned in the discussion, it is very easy to use but difficult to specify concisely. An example illustrating the use of the icon graph editor and has been previously published (Clarisse, O. et. al. 1985).

          The graph editor is not directly available in the current version of the icon manager but has been experimented by using the Grapher package of Interlisp-D; allowing the addition or deletion of arcs or nodes within a graph layout of the icon relation CHILDLIST. The icon graph editor is obtained by advising the functions GRAPH.ADDLINKFN and GRAPH.DELETELINKFN of the Interlisp-D grapher so that they call the functions ADDCHILDREN and DETACHICONS to modify the icon structure.

          The function GRAPHICON which has been used to produce several figures of icon structures (e.g. Figures 16 and 23) creates a graph representation of the default relational structure CHILDLIST using functions from the Grapher package of Interlisp-D (Xerox Artificial Intelligence Systems, 1985). Two versions are available in the files GRAPHICON and GRAPHSURGERY (they should be gathered). GRAPHICON constructs an S-expression image of the CHILDLIST structure using a function similar to APPLICON, it then calls the special function LAYOUTSEXPR (Interlisp-D) which produces the graph layout.

          The function GRAPHSURGERY produces directly a graph object from the icon structure (CHILDLIST or another relational property) and displays it in a grapher window. In both cases, the graph editor is made available (but the changes on the graph are not reported onto the icon structure). GRAPHSURGERY has a major advantage on GRAPHICON: since it does not use an S-expression to represent the icon structure, the graph displayed is not limited to a tree graph, and recursive calls, loops in the structure, are also represented by "boxed" nodes.

          All the graphs displayed in the dissertation have been layed out using GRAPHICON or GRAPHSURGERY, the graph editor has only been used to rearrange the physical position of nodes so that they would fit within the thesis format: no node has been added or deleted, each graph displayed is an exact representation of the structure presented.

         

 


 

 

CHAPTER IV

IMAGE PROCESSING TOOLS OF THE ICON MANAGER

 
Halftone Image Representation

          Visual Support for Icon Management. Several well known modern affordable workstations (Sun Microsystems, Texas Instrument Explorer, Xerox Dorado, Dlion or 1108, etc.) and personal computers have a graphic display system which has been influenced by the original work done at Xerox PARC by the Smalltalk group (Goldberg, A. 1984); they have a graphic hardware using bit-slice microprocessors which allows efficient bit block transfer (BitBlt), and they are therefore very suitable for the display and processing of halftone images. BitBlt allows bit block transfer between two bitmaps limited to any rectangular region and affected by a logical operation on bits: and, or, xor, paint, replace, erase. The same workstations have a bitmap display protocol similar to the Smalltalk Model-View-Controller (MVC) (London, R. L. et al. 1985) allowing multi-window displays and scrolling, as well as menu interfacing, which makes an ideal support to develop iconic systems, program animations and visual programming systems (Chapter V).

          Bitmap Display. We will refer to the type of graphic display mentioned above using the term bitmap display or bilevel display as opposed to analog display usually available with cathode ray tubes (CRT), (it should be noted that bitmap displays are often supported by CRT technology). Interest in bitmap displays has been motivated in the last fifteen years by new improvements in display technologies (IEEE, Inc. 1985), and several portable computers are now available with liquid crystal, electroluminescent or plasma display panels. It is possible that bitmap displays will be used as a standard graphic display protocol for computers in the future.

          Halftone Image Techniques. The transformation of an image with multilevels of grey into a bilevel image (or "bitmap image") is called image halftoning. The resulting image is known as a halftone image and can be visualized on a bitmap display. Halftone images have long been used in printing, and are now becoming popular for computer display.

          A straightforward method to produce a halftone image is thresholding; unfortunately it results in considerable information loss. Thresholding should be avoided with images of small sizes (128x128) which have a large grey scale distribution (0-255 grey levels) because important information contained in the grey level distribution is then lost. On the other hand, the use of an adaptive threshold gives good results in the present case of "small" images (Roetling, P. G. 1976). A complete survey of halftoning techniques has be presented by Jarvis (Jarvis, J. F. et al. 1976). Jarvis compares several halftone methods which produce halftone images of the same size as the original grey-level image. In a different context (Roetling, P. G. 1976), a halftone method with edge enhancement is presented which we believe is used to display images on Xerox workstations. This method uses the comparison of a raw image pixel information with information from the halftone screen (the background texture of the display screen) to dynamically adjust a threshold level which is used to produce the halftone image. This technique produces excellent results in "printing" images and texts on top of a textured background.

          Looking for a General Halftone Method. Our approach to image halftoning on the Xerox 1108 workstation has been influenced by a previous algorithm which we had implemented in C-language. On the lisp machine, the use of the bit block transfer function (BITBLT) simplified the programming task considerably, and a few pages of lisp code were more efficient than several pages of C code.

          A general halftone method called BITMAPIMAGE has been written in LISP. The algorithm performs a systematic mapping (through a histogram transform function) of each grey level value of the original image into a bitmap pattern "cut" from BITMAPTABLE and placed inside the halftone (output) image. BITMAPTABLE is a special bitmap used for encoding.

          Filtering operations on the original image are possible at the same time as the halftone bitmap is generated: averaging (AVER or AVERAGE) from any set of local neighbors, skipping (SKIP or JUMP) of pixels, distortion of the image (DISTORT). These operations can also be intermixed on the X and (or) Y axes. Processing can be limited to any sub-window of the original image (clipping region). A presentation of an earlier version of BITMAPIMAGE can be found in the literature (Clarisse, O. B. 1985).

          The BITMAPTABLE object used in the transformation is generated from a default set of 256 bitmap patterns of the size 16x16 which we called the HALFTONEFONTS. These can be edited by the user or by a user program to change the local pixel aspect of the halftone image in a way similar to changing character fonts in a text file. BITMAPTABLE is also one of the first icons (with THINKJET, MAKEICON, MOVEICON, SAVEMENU, etc.) which has been designed for the icon manager. It clearly visualizes an important transformation that occurs in the halftoning algorithm used: the histogram transform function. A 256 byte array (BMARRAY) is used by the algorithm to transform (rescale) the original 0-255 grey levels available; this was necessary due to the poor grey level distribution of some images used. A display of BMARRAY using (SHOWARRAY) is presented in the paragraph: Experimentation Results and Examples.

          The icons SHOWARRAY and PAINTARRAY are respectively used to modify the content of an array by reading a curve painted (or sketched) on the screen by the user. These icons are invoked when BITMAPTABLE is edited by the user to modify the transform function; all this interaction can be entirely visual. PAINTARRAY gives the user the liberty to "draw" a function using PAINTW or the SKETCH process of Interlisp-D.

          BITMAPTABLE is reconstructed at run time from the changes performed on the HALFTONEFONTS and/or BMARRAY. The new bitmap is then used to redisplay the icon BITMAPTABLE.

          Finally, the bitmap pattern used to encode each grey level value from the original image to produce the halftone result can be chosen to be any sub-pattern from HALFTONEFONTS using a HALFTONEMASK; a mask of any size (less than 16x16) which is applied on each HALFTONEFONT. A sub-pattern of size 1x1 produces thresholding, sub-patterns of various sizes (1x4, 4x1, 2x8, 4x6, etc.) have also been used to produce some special effects and/or image distortions presented in a paragraph below.

          Further Research is yet needed to make interaction with BITMAPIMAGE clear and entirely visual. However, it is believed that BITMAPIMAGE is a good example of iconic interaction where a novice user can immediately determine how to display images and rapidly produce the special effects he desires while learning a method of halftone image encoding. Examples of images produced by SHOWIMAGE (a procedure which calls BITMAPIMAGE) are presented in a paragraph below.

          Special properties of halftonefonts Experience with BITMAPIMAGE has given us the opportunity to test several type of halftonefonts. The first halftonefonts were created by editing 256 windows where the HALFTONEFONTS were displayed. Placing at random an increasing number of black pixels in each halftonefont was first attempted. Then our interest focused on a subset of the set of halftonefonts: font classes such that each font element is produced from the previous one by adding one new black pixel each time. These font classes have been named, logical-halftone-fonts due to their property to render meaningful the action of logical operators applied to halftone images. Examples of such halftonefonts (logical-halftone-fonts) can be easily generated manually by growing a geometrical pattern (square, triangle, spiral, star, etc.) by adding one black pixel at a time from 0 to 255 for 16x16 sized fonts.

          Consider two or more grey level images (I1, I2, ... ) and their corresponding halftone images (H1, H2, ... ) using the same logical-halftone-fonts, then:

1. and (H1, H2, ... ) = HT ( infgrey (I1, I2, ... ) )

2. or (H1, H2, ... ) = HT ( supgrey (I1, I2, ... ) )

3. xor (H1, H2) = HT ( | diffgrey (I1, I2) | )

4. xor (H1, (mv H1)) = HT (|diffgrey (H1, (mv H1))|)

          Where HT represents an image transformation using the algorithm described above (the same set of logical-halftone-fonts, the same BMARRAY for histogram transform and the same HALFTONEMASK are used through out the experiment). The keywords "or", "and" and "xor" denote logical operators applied on bitmaps (results of the HT operator). Infgrey, supgrey, diffgrey respectively signify that the minimum, maximum and difference value have been computed at each pixel location to construct the resulting image. Mv (mv) denotes a bitmap translation in the grid coordinate system, HALFTONEGRID, obtained by taking the width and height of HALFTONEMASK as x, y increments.

          The first two properties indicate how several halftone images can be combined together. The third one provides a simple way to display the absolute value of the difference between two images. The fourth property, where mv denotes a translation of bitmap H1 in the grid coordinate HALFTONEGRID, indicates how a very efficient equivalent of a directed gradient operator (Ballard, D. H. et al. 1982) for edge detection can be computed on the halftone image H1. The results of timing the above gradient operators are presented in the paragraph: Experimentation Results and Examples.

          A fifth property can be derived from 1. and 4. which we consider as a logical equivalent for the halftone image H of a Kirsh operator for edge detection with templates of order 1/2, (Ballard, D. H. et al. 1982) applied to the original grey level image I:

5. or ( xor (H, (mv1 H)), xor (H, (mv2 H)), xor (H, (mv3 H)), xor (H, (mv4 H)) ) = HT ( supgrey ( K1 (I), K2 (I), K3 (I), K4 (I)) )

         

         

Where mv1, mv2, mv3 and mv4 are four translation operators (horizontal, vertical, first diagonal, second diagonal) of one step each in the grid system HALFTONEGRID. K1, K2, K3 and K4 are the four Kirsh templates of order 1/2.

          The properties of the logical-halftone-fonts make them specially interesting for use with computer bitmap displays. The bit block transfer protocol provides very efficient logical-operations on bitmaps which may even justify the use of halftone images as a computational media before going back to an original grey level image by decoding the resulting halftone image (IMAGEBITMAP transform).

          Back to the Ordered Dither Method. After experimenting with several logical-halftone-fonts edited manually, a simple, homogeneous set of fonts was required. The homogeneity of the font texture at each grey level was expected to produce halftone images similar to some extent to the original image on analog display. We submitted the problem of finding a set of 256 logical-halftone-fonts of size 16x16, homogeneous at every grey level to our own human problem solver. After half an hour of time shared reflection on the problem, a first answer was formulated: "it's very simple, it's equivalent to finding the order in which the bolts of a cylinder head must be tightened when it is installed on a car engine!"

          The answer was correct but it took a day of trial and error with paper and pencils to produce the final result shown in Figure 27. In Figure 27, a 16x16 matrix is presented which takes once and only once all values between 1 (lower left corner) and 256 (upper left corner). It


Figure 27. The Dither16Matrix

 


 

should be interpreted as follows: the order 1, 2, 3, ..., 256 (which can be followed on Figure 28) is the order in which 256 bolts should be tightened in order to apply at each moment an evenly distributed pressure on a cylinder head of size 16x16; it also indicates the order and the position of each black pixel which must be successively entered to construct a homogeneous set of logical-halftone-fonts.

          The halftone method now proposed, which is the use of the BITMAPIMAGE algorithm with the set of halftonefonts derived from the matrix in Figure 27, is in fact closely related to the ordered dither method originally proposed by Limb and surveyed in the literature (Jarvis, J. F. et al. 1976). Therefore we call Dither16Matrix the matrix of Figure 27, we call Dither-n-Matrix a similar matrix of order n, and the logical-halftone-fonts derived from it are called the Ditherfonts. The BITMAPIMAGE algorithm can provide results similar to any of the ordered dither methods with order 2 to 16 using a unique greylevel to bitmap encoding table (BITMAPTABLE produced from the set of Ditherfonts). The BITMAPIMAGE algorithm also provides an interpolation method to obtain homogeneous sets of fonts of any size 2x2, 3x3, 4x4, ... 16x16 using a different HALFTONEMASKs and the same bitmap table, while the ordered dithers are limited to the orders:
 

		   n
2, 4, 8, 16, ..., 2
 
and therefore to the sizes of fonts 2x2, 4x4, 8x8, 16x16 only. This can be seen from the following recursion


Figure 28. Following the Order of Dither16Matrix

 


 

equations for the ordered dithers adapted from the literature (Jarvis, J. F. 1976):

 
|3 1| |Dn.11 Dn.12| D2 = | | Dn = | | |0 2| |Dn.21 Dn.22|

|(4 x Dn + D2.11 x Un) (4 x Dn + D2.12 x Un)| D2n = | | |(4 x Dn + D2.21 x Un) (4 x Dn + D2.22 x Un)| Dither-n-Matrix = Dn + Un Dither16Matrix = D16 + U16

 
Where Dn is a dither matrix of order n, Un is the unity matrix of order n (a square matrix of size n filled with 1). It should be noted that Dither16Matrix can be obtained by applying the recursion equation thrice from the matrix D2 given above and by adding U16.

          The dither ordered method has been widely used for halftone image generation, and therefore the texture of the images presented in the examples may be familiar to the reader. In fact, we believe that most of the halftone fonts produced here have never been generated before, since they cannot be produced from the ordered dither method. Effectively, the recursion equation of the dither matrix can be initiated only from height (8) different matrices (listed below) and therefore only height (8) different dither patterns of order n can be produced. We demonstrate in the next section that 512 classes of homogeneous logical-halftone-fonts can be easily generated with BITMAPIMAGE to produce halftone images.

          The eight possible dither matrices of order two are given as:

 
|3 1| |0 3| |2 0| |1 2| R1= | | R2= | | R4= | | R4= | | |0 2| |2 1| |1 3| |3 0| |3 0| |1 3| |2 1| |0 2| L1= | | L2= | | L3= | | L4= | | |1 2| |2 0| |0 3| |3 1|

 
The first four matrices (R1, R2, R3, R4) are obtained from D2 given above by circular rotation of its coefficients. Notice that they are all "right-handed": the pattern described by following the order of the coefficients 0, 1, 2, 3 "turns to the right". The last four matrices (L1, L2, L3, L4) are obtained the same way but from the transposed matrix of D2, they are all "left-handed" matrices. R1, L2, R3 and L4 can be obtained from one another by coefficients shifting, so do L1, R2, L3 and R4.

          Finally, it is interesting to notice that a set of dither patterns has been proposed by Bayer (Jarvis, J. F. et al. 1976) which are obtained from minimization of a criterion measuring the low spatial frequency noise introduced by halftone transform. The low spatial frequency noise criterion is therefore equivalent to the more heuristic criterion of evenly distributed "bit pressure" on a bitmap plane which we have used here.

          Further Properties of Dither-n-Matrix The following results have been obtained by observation of Dither16Matrix (and similar matrices of lower order) and by experimentation with the BITMAPIMAGE algorithm.

          We now consider the class of all the logical-halftone-fonts of size less or equal to 16x16 with homogeneous textures which are generated by applying to Dither16Matrix one or more of the following transforms:

1. Flipping the matrix along the middle vertical axis, the middle horizontal axis, the left diagonal or the right diagonal. Any combination of these transformations will produce height possible matrices (the eight possible dither matrices of order 16).

2. Constructing a 32x32 matrix by copying 4 times Dither16Matrix, and taking for example each matrix obtained by shifting the matrix origin in the 16x16 lower left corner. 256 matrices are obtained using this method (four (4) of which have already been obtained in 1). All are different matrices, since at least their lower left element is never the same.

3. Taking the transposed matrix of the original matrix and applying step 2 to it. 256 different matrices are obtained, four (4) of which were obtained in step 1.

          The matrices obtained in step 3 were not produced in step 2, because the transposed matrix of Dither16Matrix cannot be generated by any "shifting" operation on itself (a general proof is given below). Therefore, a total of 512 different matrices is generated from "translations" of Dither16Matrix and of its transposed matrix. That is, 512 different classes of homogeneous logical-halftone-fonts of size 16x16 exist and can all be obtained by simple transformation from one of them.

          More generaly, let n be a power of two (greater or equal to two), then:

1. There exist 2 x n x n matrices of size n x n which can be obtained from one of the four first Dither-n-Matrices of order n and its transposed matrix by "shifting" the matrix coefficients.

          Finally, two more properties of these matrices that can be used to characterize them and to justify the use of the term homogeneity are now given:

2. The sum of all the matrix coefficients from any column of a matrix derived from Dither-n-Matrix by shifting is constant and equal to

         
 

      2
( 2 n   +  n  -  1 ).
 

3. The sum of all the matrix coefficients from any line of a matrix derived from (transpose( Dither-n-Matrix ) ) is constant and equal to

         
 

      2
( 2 n   +  n  -  1 ).
 

This can be shown easily by first using the recursion equation on Dn (and adding Un to obtain Dither-n-matrix). It is then clear that this property holds for matrices obtained by horizontally shifting or vertically shifting the matrix coefficients and therefore by any shifting operation on the coefficients. This property is then extended to the whole set of matrices; and can be used to characterize the subset of matrices Shift (Dither-n-Matrix) (obtained from shifting Dither-n-Matrix) or to characterize Shift (transpose (Dither-n-Matrix) ). This produces a final proof that each of these subsets of matrices is stable by shifting matrix coefficients and are therefore disjoint.

          We will call a halftone font generated from one of these matrices, a homogeneous font and the corresponding halftone transform is called homogeneous halftone transform.

          Generation of Homogeneous Halftone Transforms A bitmap table (BITMAPTABLE) of size 512x512, where each halftone-font pattern has been copied four times besides itself, allows to simulate easily the generation of bitmap images taken from any one of the 256 classes of the homogeneous logical-halftone-fonts described above. Changing of font within this new BITMAPTABLE is done by simply translating the origin of BITMAPTABLE when using the algorithm BITMAPIMAGE. This is done by selection of an appropriate HAFTONEMASK.

          The other set of 256 homogeneous logical-halftone-fonts can be obtained by video inverting the new 512x512 BITMAPTABLE and swapping over the middle horizontal axis the sketch of BMARRAY. Effectively, video inverting BITMAPTABLE is equivalent to generating a BITMAPTABLE from the transposed matrix Dither16Matrix, and the visual transformation on BMARRAY is equivalent to using ( 255 - farray (grey) ) instead of ( farray ( grey ) ), so that the image is not video inverted (farray is the function from BMARRAY). This last transformation is easily obtained by visual interaction with the icon manager as follows:

1. The icon SHOWARRAY is executed displaying a bitmap sketch of the content of the array BMARRAY.

2. An operator from the icon EDITBITMAP is selected to reverse top to bottom the picture of BMARRAY.

3. the icon PAINTARRAY is executed and the picture of BMARRAY is read back and loaded into BMARRAY.

         

Experimentation Results and Examples. Figure 29 presents four successive halftonefonts corresponding to grey levels 65, 66, 67 and 68. Figure 30 present the grey level transform table BMARRAY as displayed by the icon SHOWARRAY and the corresponding BITMAPTABLE for halftone encoding. Figures 31 and 32 show respectively the use of Sketch to draw a grey level transfer function (in BMARRAY) and interaction with the icons PAINTARRAY and SHOWARRAY to


Figure 29. Display of Four Halftone Fonts (Size 16)


Figure 30. SHOWARRAY applied to BMARRAY and the BITMAPTABLE


Figure 31. Using the Icon PAINTARRAY and the Interlisp-D Sketch Process


Figure 32. Visualization of BMARRAY After the Changes

 


 

modify BMARRAY. The images presented in Figures 33 to 36 have been obtained from the BITMAPIMAGE algorithm; they represent bitmaps of size 438x512. Figure 33 represents the original image (DI3) with a poor contrast level (the histogram was mainly distributed between the grey level values 8 and 130). Figures 34 to 44 are a series of halftone images produced by SHOWIMAGE with various sizes and shifting of fonts. All these figures (except Figure 33) use the grey level transformation through BMARRAY represented in Figure 32. Figures 45 and 46 present an example of special effects (horizontal and vertical line textures) available with a specific size of HALFTONMASK corrected by image (bitmap) expansion. The algorithm SHOWIMAGE produces an output icon (window) which displays the bitmap result of a halftone transformation, the title is constructed by SHOWIMAGE from the arguments of the transformation. For example: Figure 41 has the title "DI3:(1 1 6 6)/(AVER AVER 3 3)", the halftone transformation was obtained from image DI3 with a HALFTONEMASK region of (1 1 6 6) and averaging (AVER) in both X and Y directions among nine (3x3) pixels. The HALFTONEMASK is placed on top of each font bitmap so that (1 1 6 6) represents a font of the size 6x6 shifted by (1 1). Other arguments can be passed to BITMAPIMAGE such as a "clipping region" which are not used by SHOWIMAGE. The icon created by SHOWIMAGE can retain the history command if desired (e.g. "(BITMAPIMAGE DI3 '(1 1 6 6) '(AVER 3 3) NIL)"), a copy of BMARRAY and the font type should also be saved.


Figure 33. Original Image: Icon DI3


Figure 34. Halftone Encoding of DI3 with Font Size 2


Figure 35. Halftoning with Font Size 16 and Averaging


Figure 36. Halftoning with Font Size 8 and Pixel Skipping


Figure 37. Timing of the SHOWIMAGE Algorithm


Figure 38. Halftoning with Font Size 4 and Pixel Skipping


Figure 39. Halftoning with Font Size 4 and Averaging


Figure 40. Halftoning with Font Size 6 and Pixel Skipping


Figure 41. Halftoning with Font Size 6 and Averaging


Figure 42. Halftoning with Shifting of Font (Size 6) to the Origin (0 0) and Averaging


Figure 43. Halftoning with Font Shifting to (2 2)


Figure 44. Halftoning with Font Shifting to (3 3)


Figure 45. Special Effect with the BITMAPIMAGE Algorithm: Horizontal Line Texture


Figure 46. Special Effect with the BITMAPIMAGE Algorithm: Vertical Line Texture

 


 

Figure 37 represents the timing of SHOWIMAGE when generating images similar to Figures 33 to 36. In some cases, the real time value can seem excessive compared to the theoretical time; this is due to the disk swapping necessary to reallocate the virtual memory. In general, approximately 30 seconds are required to produce bitmaps of the size considered (438x510).

          Figures 47 to 51 present the result of four logical gradient operators: North-West, North-East, East and North; and an equivalent of the Kirsh operator of order 1/2. Timing results for the first four operators are presented in Figure 52: approximately two seconds are necessary to compute each operator and display the result.

          To interpret the quality of the images presented, it is important to realize that when a HALFTONEMASK of size 16x16 is used, each image is equivalent to a greylevel image of 27 by 32 pixels. A HALFTONEMASK of size 6x6, as used in most figures, produces an image equivalent to a greylevel image of 73 by 85 pixels with 36 possible grey level values.

A Region Growing and Contour Following Algorithm.

          A general algorithm for region growing is presented; it has been designed to retrieve connected regions from halftone images which have a known spatial periodicity of texture. The same algorithm can also be used for generating a region expansion into rectangular regions, (a set of


Figure 47. Logical North West Gradient of Halftone Image


Figure 48. Logical North East Gradient of Halftone Image


Figure 49. Logical East Gradient of Halftone Image


Figure 50. Logical North Gradient of Halftone Image


Figure 51. Logical Kirsh Operator (Order 1/2) applied to a Halftone Image


Figure 52. Timing of the Four Gradient Operators

 


 

disjoint rectangular regions) as well as for boundary growing and contour following.

          In particular, it can give a result equivalent to a quad-tree decomposition when initiated on an image with width and height equal to a power of two; or it can proceed as a classical contour follower when initialized on a region which delimits the boundary of an image object.

          A Model for Cell Reproduction. The technique used is based on modeling cell reproduction in living tissues. Although it can be related to the method of growth geometry coding proposed by Frank et al. (Frank, A. J. et al. 1980) for encoding and transmission of bilevel images, the method presented originated in a very different context and is more general. A survey of other region growing techniques such as region growing via thresholding, splitting and merging can be found in the literature (Ballard, D. H. et al. 1982).

          The model of cell reproduction used here is very general compared to the pattern growing methods used so far in image processing, however it would certainly seem too simple to the biologist!

          The region growing technique now presented extends the definition of patterns geometrically grown to any rectangular black and white texture and to properly connected sub-patterns of the texture; this method seems to be particularly adequate to analyze and encode halftone images. In other words, a "cell" is defined as a rectangular region of any size which has an internal texture (bitmap) and a "gene". The "gene" commands to a cell its progession pattern in the image plane. The texture is used as a reference to decide if the cell belongs to the image or if it should be divided (cell division).

          It seems to us that a lot can be learned from modeling cell reproduction which can be applied to synthesis, analysis encoding and transmission of images. This will be illustrated in the coming discussion. After all it should not be surprising that "seed cells" with special "gene" codes can be generated which can be used to create (recreate) objects in a two or three dimensional image space.

          Algorithm Description A set of "seed cells" is placed inside the image either by a calling procedure or by user selection with a cursor. Each cell is originally defined by a rectangular region (of any size) included inside the image plane. The "seed cells" should be disjoint but may not be all distinct.

          A texture pattern is then extracted from each cell region and passed to the CELLGENERATOR procedure (a matching pattern can also be used to limit the matching to some specific points of the texture).

          The CELLGENERATOR procedure is then applied to each cell; if the texture of the cell matches the texture of the image, the cell is "implanted" and a set of neighboring cells (identical in texture) is generated and returned. On the other hand, if the implantation of one cell fails, the procedure CELLDIVISIONADAPTIVE is called which will insert a set of "divided cells" into DIVIDEDCELLS.

          The overall procedure ends when all remaining cells fail to reproduce and no cell can be further divided.

The result of the procedure is normally reported inside RESULT which is an image representing a "mask" (black over white) of the region extracted from the image.

          Other important results can be obtained using this algorithm procedure:

1. A list of all the cells successfully implanted can be formed which is a representation of the region extracted in terms of disjoint rectangular regions. This region description can be used to encode, transmit and regenerate the original region.

2. A list of all the nondivisible cells which have been successfully implanted, that is a list of point coordinates which can be used to generate a polyline contour of the region extracted.

3. As we already mentioned, the algorithm is designed to extract connected regions from the image plane, but several regions can be grown simultaneously in one procedure from well chosen seed cells. This gives an alternative for producing the region description of any region with a known number of disjoint connected components.

4. When the region growing algorithm is initialized with two seed cells covering the total image plane, the first one having a white texture (WHITESHADE) and the other a black texture (BLACKSHADE), the algorithm will grow all the white regions of the image plane (cells with WHITESHADE attribute) and all the black regions of the image plane (BLACKSHADE). In the particular case of an image with sizes equal to a power of two, we obtain two lists of cells (regions with black or white color) which extensively describe a quad-tree decomposition of the image (note that the list of cells (regions) returned by the algorithm are ordered by decreasing size).

The overall algorithm is described by the procedure ADAPTIVEREGIONGROWER:

  
procedure ADAPTIVEREGIONGROWER arguments (SEEDLIST IMAGE RESULT RESULTREGION)

          DIVIDEDCELLS <- SEEDCELLS CELLGENERATOR <- MAKEADAPTIVEGENERATOR (IMAGE RESULT RESULTREGION DIVIDECELLS)

          (repeat while DIVIDEDCELLS CURRENTCELLS <- DIVIDEDCELLS DIVIDEDCELLS <- () (repeat while CURRENTCELLS CURRENTCELLS <- (for CELL in CURRENTCELLS gather union of (apply CELLGENERATOR to CELL)) ) ) end of procedure ADAPTIVEREGIONGROWER

 

Where, SEEDLIST is a list a cells used to start the region expansion process, and CELL denotes a rectangular region of the image plane on which a given texture pattern is looked for.

CELLGENERATOR is a procedure returned by MAKEADAPTIVEGENERATOR, it produces efficient ways of performing the various tests to implant or divide a cell depending on the image size, the texture searched for and the current size of the cells grown (the user does not have to worry about writing it).

         

An example of a CELLGENERATOR procedure is now given:

  

procedure CELLGENERATOR0007 argument (CELL) stack arguments (IMAGE RESULT RESULTREGION DIVIDEDCELLS)

          if REGION of CELL is inside RESULTREGION then if CELL is not implanted in RESULT then if PATTERN of CELL matches with PATTERN of IMAGE inside REGION of CELL then implant CELL inside RESULT; return (apply CELLDUPLICATION to CELL); else insert inside DIVIDECELLS the result of (apply CELLDIVISIONADAPTIVE to CELL); return (); else return () else return () end of procedure CELLGENERATOR0007

 

Each CELL has a "gene" attribute (GENE of CELL) and a "texture" attribute (TEXTURE of CELL).

          The attribute "gene" is the name of a procedure which is called to duplicate the cell (for example NGENE EGENE SGENE WGENE are currently used to denote cells which are reproducing themselves in the North, East, South, West directions). Other functions can be defined by the user to indicate new directions of reproduction for cells; this provides a way to change the geometry of the cell reproduction pattern.

         

The CELLDUPLICATION procedure is simply defined as:

 

     procedure CELLDUPLICATION
	     argument (CELL)

          return (apply GENE of CELL to CELL) end of procedure CELLDUPLICATION

 

In the lisp implementation, CELLDUPLICATION simply consist in the evaluation of a CELL datatype (CELL is a list with the attribute GENE as first element).

          The attribute "texture" is used to match the cell texture with the local texture of the image before cell implantation; it is only modified when a cell is divided (each new cell will then hold a portion of the texture of the original cell texture).

         

This is done by the procedure CELLDIVISIONADAPTIVE:

  

procedure CELLDIVISIONADAPTIVE argument (CELL)

          NEWREGIONS <- divide REGION of CELL MOTHERCELL <- or (MOTHER of CELL) CELL

          return (for NEWREGION in NEWREGIONS gather (create CELL with GENE <- 'SUBCELL REGION <- NEWREGION TEXTURE <- cut TEXTURE of CELL at region (relativeregion NEWREGION with reference (REGION of CELL)) MOTHER <- MOTHERCELL DIRECTION <- if NEWREGION borders REGION of MOTHERCELL then use direction of NEWREGION with reference REGION of MOTHERCELL else use 'INSIDE ) ) end of procedure CELLDIVISIONADAPTIVE

 

Where, "divide REGION of CELL" usually returns a list of two to four regions which disjointly cover a connected half of the original cell region. For example, if a cell with gene NGENE is divided, the sub-cells produced cover the south half part of the cell; if the cell has a WGENE gene, the sub-cells cover the east half of the original cell. The expression "create CELL with ..." return a new CELL datatype ("<-" is used as in interlisp to denote attribute setting) and,

"direction of NEWREGION ..." usually return NGENE, EGENE, SGENE or WGENE according to the position of NEWREGION in the reference region.

          When a cell is divided, a set of smaller sub-cells is produced each one having the GENE attribute 'SUBCELL to remember that it is the result of cell division.

Note that, except in the case of homogeneous cell texture, a sub-cell should never be duplicated using its own texture. The duplication of sub-cells which is performed by the SUBCELLDUPLICATION procedure uses the texture of MOTHER cell (see below). A sub-cell can be implanted or further divided using its own texture. Therefore a special procedure; SUBCELL must be called within the CELLDUPLICATION procedure to properly reproduce sub-cells.

          A sub-cell has two new attributes; the attribute MOTHER to remember the top level cell from which the sub-cell originated and, the attribute DIRECTION to indicate the approximate position of the sub-cell inside its "mother cell".

          Finally, when a sub-cell has been successfully implanted and has any other DIRECTION attribute than 'INSIDE (that is when a sub-cell has been successfully implanted on the border of its mother cell), it gives the right to its mother to produce a duplicated cell in the direction of that sub-cell.

          This is done in two steps inside a CELLGENERATOR procedure. First when the texture of a cell (sub-cell) matches the local image texture, "implant CELL inside RESULT" is called with CELL (a sub-cell) as argument; then the procedure SUBCELL is automatically invoked when CELLDUPLICATION is called:

  

procedure SUBCELL argument (CELL)

          MOTHERCELL <- MOTHER of CELL NEWGENE <- DIRECTION of CELL

          if NEWGENE equal 'INSIDE then return (SUBCELLDUPLICATION CELL) else return (insertcell (create CELL with GENE <- NEWGENE REGION <- translateregion REGION of MOTHERCELL in the NEWGENE direction TEXTURE <- TEXTURE of MOTHERCELL ) inside (SUBCELLDUPLICATION CELL) ) end of procedure SUBCELL

 

When a sub-cell is successfully implanted on the interior border of its mother cell, the procedure SUBCELL returns one more cell (added to the result of (SUBCELLDUPLICATION CELL)) which is identical to the mother cell but translated in the sub-cell direction. The SUBCELLDUPLICATION procedure performs cell duplication inside the area of the original mother cell and uses the mother cell texture. It proceeds similarly to CELLDUPLICATION but restricts the expansion domain of the new cells to (REGION of MOTHERCELL) and substitutes in each NEWCELL the texture attribute (cut TEXTURE of MOTHERCELL at region (relativeregion (REGION of NEWCELL) with reference (REGION of MOTHERCELL)).

To complete the description of the region growing algorithm an example of a "gene" function (NGENE) is now given:

 

procedure NGENE argument (CELL) CELLREGION <- REGION of CELL CELLTEXTURE <- TEXTURE of CELL

          return (listof (create CELL with GENE <- 'NGENE REGION <- translateregion CELLREGION in the NGENE direction TEXTURE <- CELLTEXTURE ) (create CELL with GENE <- 'WGENE REGION <- translateregion CELLREGION in the WGENE direction TEXTURE <- CELLTEXTURE ) (create CELL with GENE <- 'EGENE REGION <- translateregion CELLREGION in the EGENE direction TEXTURE <- CELLTEXTURE ) end of procedure NGENE

 

When an NGENE cell is reproducing, it creates another NGENE cell, a WGENE cell and an EGENE cell, therefore, a cell progressing to the north (N-GENE) is not limited to only producing cells which progress to the north. Again the user is free to define new functions than the ones currently used (N/S/E/W/GENE) to create other patterns of cell reproduction; for example, NE/SE/SW/NW/GENE have also been used, and geometrical progressions similar to those of a chessboard game can be used. It should be noted that "loose" growing schemes can generate non connected regions.

          Experimentation and Examples. Two sub-algorithms of the algorithm presented above have been successfully implemented on a lisp machine Xerox 1108 and are currently used in the icon manager: ICONMASKADAPTIVE and REGIONGROWER.

ICONMASKADAPTIVE has been designed to automatically generate icon masks from icon bitmap images so that icons with any desired shape can be displayed on the screen.

          It uses a general procedure of cell growing inside an image, but the set of seed cells is limited to a set of cells placed on the frame of the original image. ICONMASKADAPTIVE is usually used for growing a white region around the icon sketch (white is the default background color for an icon sketch).

REGIONGROWER has been designed to extract connected regions from the screen which have been produced by periodically reproducing any given texture pattern in the coordinate directions of the image plane.

          It performs a matching of the texture pattern inside the image and extends the region of matching to produce a region mask. It does not allow cell division, and therefore the mask returned will approach the true region boundary with an error less or equal to the texture size.

          The REGIONGROWER procedure is useful to perform some "reading" from the screen. It is used to extract regions of homogeneous bitmap texture from any bitmap. This technique can be applied to image understanding, image compression, and even image transmission over an ethernet link (using the "evalserver" package between two Xerox 1108 stations for example). The region mask (returned by REGIONGROWER) is useful to efficiently reconstruct the original region by filling it with the original texture. It can also be used to change the texture or the color of the original region.

          REGIONGROWER is an easy way for the user to paint or replace locally the colors from a sketch drawing or from any halftone image displayed on the screen. It is also an important tool for icon design.

          The algorithms REGIONGROWER and ICONMASKADAPTIVE are the basic operators for several functions which have been used in Chapter II and III: CONTOURFOLLOWER, GROWERDEMO, ICONICMAPDEMO, MINIMUMENCLOSINGREGION, MAKEICON,

MAKEICONTRANSPARENT. They have been largely used to construct the icon MAPEXAMPLE.

          Figure 53 concisely gathers several interesting properties of the region grower technique: it is applied to a bitmap of size 128x128 to automatically grow its "iconmask". Several transparent icons have then been generated and filled with various textures. Figures 54 and 55 contain timing results to compare the two region growing methods (REGIONGROWER and ICONMASKADAPTIVE in the case of Figure 53): Figure 55 demonstrates the superiority of the adaptive cell division method in practically all cases.

          Figures 56 to 65 represent a similar study in the case of a large bitmap. The time required to construct an "iconmask" of size (438x510) in the case of a real image is still very reasonable if precision to the order of a pixel is not required.

          Halftone Image Transmission and Reconstruction.

Communication of halftone images is a very important issue with regard to the possibility of transmitting visual information over a local net between two workstations or over a phone line between personal computers. It has applications with respect to computer conferencing systems (Sarin, S. et al. 1985) as well as multimedia message management systems (Forsdick, H. C. et al. 1984). It is also of great importance to us since it partly solves the


Figure 53. The REGIONGROWER Algorithm with Adaptive Cell Division and Creation of a Transparent Icon


Figure 54. Timing of REGIONGROWER and ICONMASKADAPTIVE algorithms


Figure 55. Timing Graphs for Comparison of two Region Growing Methods


Figure 56. Growing an Icon Mask for DI3: Level of Cell Division Equal to 3


Figure 57. Growing an Icon Mask for DI3: Level of Cell Division Equal to 4


Figure 58. Growing an Icon Mask for DI3: Level of Cell Division Equal to 6


Figure 59. Growing an Icon Mask for DI3: Level of Cell Division Equal to 7


Figure 60. Growing an Icon Mask for DI3: Level of Cell Division Equal to 8 (Bit Size)


Figure 61. Superimposition of Icon Sketch and Icon Mask


Figure 62. Changing the Background Texture of DI3


Figure 63. MAKEICONTRANSPARENT? Applied to DI3: Icons With Text in the Background


Figure 64. Timing of ICONMASKADAPTIVE Applied to a Large Bitmap (DI3)


Figure 65. Timing Graph (Without Swapping Time) as a Function of The Level of Cell Division

 


 

problem of transmitting iconic objects between workstations. This problem is the object of further investigation in Chapter VII.

          In the following, we discuss several possible schemes to transmit and reconstruct halftone images, based on the region growing algorithm presented above.

          Transmission using the Region Grower. The transmission schemes for halftone images presented here have not yet been tested, and should therefore be considered as a description of future work. In some cases, a result expectation may be formulated in comparison to a method used previously, it should be considered as the author's speculation.

          The first halftone image transmission scheme is described by the following steps:

1. The "region grower" is first applied to a halftone image on workstation A with a set of seed cells placed on the image frame of A which take for texture attribute, the local image texture. Several regions are grown in parallel and expanded towards the center of the image. A black mask of the union of the regions grown is retained as well as the lists cells which died on the image plane (they delimit the boundaries of each region).

2. The gene attributes of all boundary cells from 1., are reversed (NGENE becomes SGENE, EGENE becomes WGENE, etc.) and the new lists of cells are transmitted to workstation B where a region grower process is started using the cells received as seed cells. On workstation B the regions are recreated in the reverse order of the way they were extracted from A. The growing of a region ends when the image boundary is reached (a mask of the regions grown is also retained on workstation B to limit the growth of regions transmitted in future steps).

3. Some important regions may have been left from the first growing phase in workstation A. New seed cells are therefore produced from the lists of cells retained in 1., with their original gene. These cells are "revived" by duplicating them and giving them a new texture attribute: the local texture of the image.

4. Step 1, 2, 3 are repeated until no cell can be revived in workstation A.

          The choice of original seed cells for halftone image transmission is arbitrary: a few cells placed around the center of the image could be used instead of frame seed cells. Further experimentation with the region growing procedure is needed.

          Gene Codes for Better Transmission. The term GENE has been used in the description of the region grower algorithm (as in NGENE, EGENE, etc.) to mark the fact that each "gene" is a function definition which is responsible for the behavior of a cell in the image plane. The "gene" definitions determine precisely how a given set of "cells" will expand and produce an image pattern; in the current implementation of the region growing algorithm, the gene functions are very simple: they only remember how a cell is supposed to progress in one direction and not in the other directions.

          When a region is growing, certain cells can survive and progress in one direction or the other, other cells cannot be implanted or divided, and therefore "die". During this period of expansion, more information about each cell could be recorded and encoded into the "gene" function (such as the number of times one cell has been able to reproduce itself before dying, etc.) and used to reconstruct the region extracted. This could provide a more efficient encoding scheme for regions of the image: there is no reason to retain the list of all the cells on the boundary of region which have been grown, if the region can be recreated by growing a few well selected seed cells.

          In other words, once several regions of a halftone image have been expanded from a given set of textures using the algorithm described above, it is possible to retain the list of boundary cells produced, to transmit it and to reconstruct the image from it, but it may be better to backtrack and find out which original seed cells would allow more efficient region reconstruction from their "gene" information (the life time of each cell may be recorded in the gene for example).

         

Selecting "seed cells" and extracting their "gene" functions from the observation of the growing image patterns is somewhat similar to the construction of a pictorial grammar as described in the literature (Fu, K. S. 1974).

          Backtracking for Seed Cells. Choosing and transmitting such "seed cells" is an extension to halftone images (using the "region grower" algorithm) of the method first proposed by (Frank, A. J. et al. 1980) for compression and transmission of black or white regions from bilevel images with "high perceptual quality of progressive display, high encoding efficiency ...". In the context of our region growing algorithm, the backtracking scheme proposed by Frank still holds if the sub-cells from a region boundary are replaced by their mother cell.

          Backtracking to find seed cells which can best regenerate the region is then equivalent to shrinking the region obtained by replacing all sub-cells (in the list returned by the region growing algorithm) by their mother cells. The shrinking operation needed is the reverse operation of a region growing with fixed size cells; it can be related to a skeleton extraction method.

          A region shrinking operation may be costly and should only be performed if the bandwidth of the communication link requests it. Finally, it is possible, and experience with the transmission methods proposed will tell,

that certain regions are best encoded for transmission using boundary cells while others are best encoded using selected seed cells and genes.

          Conclusion and Future Research

          Although image coding efficiency using the region growing technique presented has not yet been tested (its study is beyond the scope of the present work), the author believes that the representation of regions in halftone images by cells and textures as suggested in the discussion can provide an efficient coding scheme and sufficient compression rate for the transmission of a large variety of images and therefore can be applied to data communication between iconic systems (a problem related to multimedia message management).

          Further research may be enterprised to improve the efficiency of coding halftone images by using growth geometry coding from well chosen seed cells using the method mentioned above. The method proposed by Frank (Frank, A. J. et al. 1980) for growth encoding of bilevel images would probably produce very poor results when applied to halftone images encoded by the ordered dither method (Jarvis, J. F. 1976) since it can only grow black or white bit sized cells (1x1). Frank suggested that with halftone images, several bilevel image planes can be used in that case, but the algorithm may then have to run in 64 image planes when an 8x8 dither is used to produce the halftone image. We point out that, due to its ability to grow regions of texture pattern produced by halftone transformation (using homogeneous halftone fonts) directly in one image plane, the region growing method proposed combined with a geometry coding scheme must be better fitted for efficient encoding of halftone images.

          The encoding of cell "gene" functions (a technique similar to the extraction of a picture grammar (Fu, K.S. 1974)) should also be studied until complex image patterns can be reconstructed from a small number of seed cells.

          An alternative to this type of image transmission uses image decomposition into a structured icon and transmission of the icon. Although the actual image decomposition into icons usually relies on region growing techniques (see Chapter II), it is possible to study and experiment with the transfer of structured icons (designed interactively) as an approach to progressive image transmission. A method of icon transmission is proposed in Chapter VII which allows progressive transfer of a hierarchical structure of images (first transmitting one level of structure information (e.g. property CHILDLIST of the icon), second transfering icon sketch descriptions).

         

 


 

 

 

CHAPTER V

VISUAL PROGRAMMING IN THE ICON WORLD

 
Visual programming did not seem to attract the attention of many researchers until recently. We are tempted to think that this new interest for visual interaction with computer and for the so called "visual languages" (they still need more work to be classified as computer languages) has been triggered by the results of the Smalltalk research group at Xerox (Goldberg, A. 1984), but we will not forget the large number of research groups working with computer graphics, image processing, computer visions, image databases, relational databases, query languages, etc., who have all worked to "do it more visually".

          With the coming of modern workstations (and some personal computers) with high level graphic display protocols, it is now possible for a large group of people to experiment with visual interface systems. Research in visual programming is also important to us because it prevails information exchange between human and machine (and between humans through machines) using the largest bandwidth communication channel most humans have the privilege to have: a minutely organized and highly parallel architecture of neurons which constitute our vision system, (Restal, R. M. 1984).

          Visual programming techniques are usually treated differently concerning program visualization or program visual design. We personally think that these two aspects should be treated as one: program visualization can be a debugging tool for visual languages, it is also an interface with visual programs. A visual language must include both a program visual design methodology (visual procedure editor) and a large set of program visualization tools so that visual programs can interact and be debugged visually. We also expect visual programs to be able to accept visual input and produce visual output.

          We review in the following different techniques which are currently used to visualize programs as well as to design programs visually. The reader will judge for himself the limitations of each method as an approach to visual languages. We then propose a more general approach to the design of visual languages which is based on the use of icons as we have defined them for experimentation with the icon manager. A survey of Graphical Programming Techniques can also be found in the literature (Raeder, G. 1985).

          Unless a specific reference to a paper or to a procedure written for the icon manager is given, many of the ideas mentioned in the following are the author's own speculation from his experience with a lisp machine environment, they have not been fully tested.

         

A Survey of Visual Programming Techniques

          Program Visual Design. A common approach to visually design programs is the use of flowchart diagrams. Flowchart diagrams have been used for about thirty years to teach the basics of programming as well as to represent complex software organizations. A flowchart is visual and its interpretation only requires from the user to know a simple model of the sequential machine; it does not rely on the knowledge of any specific programming language and represents clearly the control flow mechanism involved in a procedure.

          The flowchart approach is limited to the representation of sequential tasks. Other visual models have been designed to represent the control mechanisms involved with procedures executed on parallel architectures (e.g. UCLA graphs, dataflow diagrams). These enhanced visual models of computer procedures are usually created by introducing new symbols to specify the parallelism and awaiting of events. These models could be used as well for visual programming.

          Visual Programming Example with Flowchart. An example of a visual language using the flowchart construction approach is Pict (Glinert, E. P. et al. 1984). Pict allows the user to design procedures which are translated into a small subset of PASCAL instructions from a flowchart. In Pict, flowcharts are constructed using a predefined set of icons inside an editor window, extensive use of color is made (in particular, to distinguish between variables). A report on user response is given which seems to demonstrate the advantage of a visual approach to programming in PASCAL in the case of novice students.

          The Nassi-Shneidermann structure diagram is similar to the fowchart diagram representation, but has the aspect of a form partitioned in rectangular boxes and triangles (Glinert, E. P. et al. 1984). It can be used to represent and design programs whenever the graphic capabilities of the display are limited.

          Visual Programming Example with State Diagram. Similar to the flowchart design approach is the use of state diagrams to visually define procedures. State diagrams are widely used to represent algorithms and seem to be particularly suitable for the design of user interfaces. A visual programming language using state diagram construction has been partially implemented on a SUN workstation at Naval Research Laboratory, Washington, DC (Jacob, R. J. K. 1985). The language proposed allows the design of user interface procedures from graphical construction of (state) diagrams and sub-diagrams. Although it is not specified how the diagram editor will function, the interpretation of diagram procedure is illustrated using the example of a desk calculator. Input and output token actions are programmed in C-Language and nonterminal transitions result in calls to a sub-diagram which must be completely traversed to satisfy the original transition.

         

Other techniques can be used as visual programming tools in a similar fashion: Petri net, finite state machine diagram and other diagrams used in automata theory.

          Building from Blocks Programming. A three dimensional program design technique using iconic blocks is suggested by Glinert (Glinert, E. P. et al. 1985); it can be seen as an extension of the previous work on Pict which we have just mentioned. The design of a procedure would use easy to learn, natural syntactic rules similar to children building from block toys. Blocks are cube-shaped objects with knobs and sockets to specify (or restrict) the possibilities of block association. An example of possible application displays a few blocks for Algol syntactic keywords and their projections in a two dimensional plane. One advantage in the use of three dimensions is for representing parallelism.

          Draw What It Is and See What It Does. It is clear in the context of the previous examples, how program visualization can be generated by flashing module elements from a flowchart or a state diagram, bringing up sub-graphs (which will disappear after execution) when a call to a subroutine is performed. In the general case, visual programming must assist the user with a procedure design problem by producing a clear picture of the algorithm and providing visualization tools to animate the algorithm picture when a sample execution of the procedure is performed.

          A visual programming language should provide an environment where the user can draw a task and then see this task beeing executed. It is often argued that some computer tasks cannot be visualized (animated) as easily as image processing routines. This may not be correct, after all the model of a Turing machine is essentially visual, the architecture of a computer is physical and visual and the effect of a microprocessor instruction can be visualized as well. On the other hand, it is not necessary to visualize everything: program visualization may be limited to programming "in the large" (organizing large software systems from iconic representation while low level functions are written in a traditional language).

          In a different context, it appears to us that the visualization of situations, the creation of visual models for new concepts can very well be the support of the powerful "problem solver" that is the human brain. How many of the concepts that we manipulate have a visual representation conscious or unconscious?

          Program Visualization.

          PegaSys is a system implemented in Interlisp-D to visualize program designs (Moriconi, M. et al. 1985), where an example of visualization of a network protocol is presented. Various degrees of refinement can be obtained when "observing" the protocol mechanism. Several iconic objects are placed in a "flat" topology (with no overlaping), e.g. host, line, packet, read, write, sender, receiver and may only appear when the degree of visualization is fine enough.

          London has presented another form of program visualization named "program animation" (London, R. L. et al. 1985). In particular, animation of an algorithm for finding longest common subsequences in two strings is presented. This system is implemented with Smaltalk. It apparently uses a set of animation tools which can be inserted in a program and are updated while the program is running.

          A knowledge programming environment built on top of Interlisp-D, LOOPS (Stefik, M. et al. 1983), has inherited object programming features from Smalltalk, and possess interesting animation capabilities known as gauges which are used to display the value of "active variables". Gauges can be inserted as probes in a Loops program and display value changes in form of vertical scale, dial, meter, etc.. Gauges are also useful as debugging tools.

          Conclusion on Visual Programming Techniques. From the study of visual programming examples as the ones presented above, it is not clear to us if a general approach to visual programming is commonly adopted or not. These recent "visual programming approaches" seem limited to their author's own definition of a visual program, they either offer an apparently rigid syntax to organize program blocks into procedures or the "constuction rules" provided by the system are simply not mentioned.

          We believe that further experience with the visual programming approaches developed recently will help to produce programming methods which reflect more widely the advantages provided by visual interaction.

          Efforts in psychology and cognitive science to analyze the mechanism involved in man machine visual interaction will also be helpful to accomplish this task (Rohr, G. 1984).

          Program Visualization Using Video Games

          User reaction in front of a visually active computer system can be inferred from the study of video games. Video games have the ability to specify rules on object dependencies and object displacements without any ambiguity and with almost no use of textual information by simply presenting a simplified scenario of the universe that the user will enter when inserting a coin. This property makes viseo games both easy to learn and attractive to use. In other words, video games state themselves clearly by displaying an animated icon of the game procedure which will become interactive by icon selection (insertion of a coin). It should be noted that video games have largely augmented the variety of games situations which can be proposed to players.

          A similar effect is expected on the class of programming languages when icons are used as a programming support. Note that interaction with an ideal iconic environment is not limited to user interaction through mouse buttoning or keyboard (or joy stick as in video games), but may include images input from a camera or speech input. In multimedia document and communication systems such interaction in the broad sense is considered (Forsdick, H. C. et al. 1984), (Sarin, S. et al. 1985), (Reynolds, J. K. et al. 1985).

          To conclude with this reference to video games as program visualization tools, we shall say that "video games" may become an interesting way to solve complex problems: it can be easier to construct a video game to analyze a problem visually and find a solution to it from "playing the game" than attempting resolution of the same problem from traditional computational methods.

          In fact, we propose the use of video game type of programming to explore visually NP problems and to visualize several paradoxal situations taken from "An Eternal Golden Braid" (Hofstadter, D. R. 1980). The use of program visualization to explore an NP problem is already practiced to a large extent by research groups which have been creating and exploring fractal objects on computer systems (image synthesis) in the past ten years (Mandelbrot, B. 1984).

          Visual Programming in The Icon World

          Although we are not providing in the following a complete syntax for a visual language nor a detailed method for designing visual languages, we define a goal which, according to us, should be approached and soon attained by visual languages. We then illustrate how the techniques that we have designed for icon management (Chapter II and III) can be successfully used in the quest for visual languages.

          Programming with Extended Visual Interaction. A visual language must provide the user with a way to organize "tasks" and define program procedures by visual interaction. Visual interaction (extended) can be interpreted as one or several of the following types of interaction:

1. Definition of template forms and forms using techniques similar to those developed for form management systems and for query languages, see for example (Chang, N. S. 1981) and (Shu, N. C. 1985).

2. Menu selection using a mouse, a light pen, joy stick or other techniques of cursor positioning.

3. Icon selection where icons are represented inside menus or forms as in 1 and 2 (icons are shown from their sketch or name).

4. Icon design using the techniques provided with the icon manager and icon interpretation.

5. Static interpretation of pictures drawn and painted. Dynamic interpretation of pictures (e.g. any cursor movement is dynamically followed and interpreted). Image input from a camera.

          The first three visual interaction methods are largely used and are used in particular with the systems surveyed above. The fourth method (which also uses methods 2 and 3) is where our interest is now focusing. The fifth method is still largely opened to research; it has been illustrated earlier with examples of user interaction icons where information from the screen is read back and interpreted to create or modify variables (datatypes), e.g. input and output icons, READREGIONFROMSCREEN, UPDATEICONSKETCH and MATCHICONSKETCH from Chapter II, SHOWARRAY, PAINTARRAY and ICONMASK from Chapter IV and MATCHICONSKETCH from Chapter VI). It should be noted that a complete man machine pictorial interaction will only be realized when a certain form of visual intelligence is made available in a computer environment. We believe that a structured object approach (as in icon management) will also play an important role in the design of visually intelligent computer systems, but the object of this discussion is to illustrate how the icon manager can be modified into a visual programming environment.

          A Visual Language Scenario. A scenario for a complete visual language must clearly present the design of a visual procedure (a visual representation of a program procedure) from pictures, drawings (e.g. arcs, oriented arcs, etc.) and by editing and placing pictorial objects at specific (relative) locations on the display screen.

          The execution of a visual procedure must appear to the user as if it was performed directly by interpretation of the logical relations between picture objects (icon relations) and must therefore involve (exact or partial) recognition of pictorial symbols (icon interpretation, icon indexing and icon matching).

          Design and Interpretation of Iconic Procedures. The problem of creating and editing visual procedures is limited to the use of tools available for icon management. In other words, we restrict ourselves to visual procedures obtained by "putting together" a set of well defined icons on the screen, where each icon has a function definition (an s-expression with its CAR equal to LAMBDA or NLAMBDA or a piece of compiled code). These procedures are called iconic procedures.

          The interpretation of an iconic procedure consists in deriving from recursive exploration of the icon relations a sequence of executions for each sub-procedure (a LISP function or another visual procedure) as well as all the information necessary to specify argument transfer from one procedure to the other. Therefore, it is assumed that control flow and data flow can both be derived from the icon structure which define the visual procedure.

          The Advantage of Iconic Procedures: Dimension of the Icon World. If we succeed in defining conventions for icon placement which consistently provide visual encoding of control flow and

data flow information, we have a way to increase the dimension of the program design space.

          A program design space is defined as the environment available to the user to create and modify a computer procedure.

          The dimension of a program design space is defined as the dimension of the "space" provided by an editor program to create and modify a procedure. This dimension is considered to be equal to one (1D) when a simple line (or text) editor is used since a program is then defined and modified as a sequentially ordered set of instruction. In the case of structure oriented editor (as available with a Lisp Machine), this dimension is estimated to be equal to one and a half (1D1/2) since expressions are modified as if editing a tree structure (dimension 1D1/2) by adding and deleting nodes.

         

          The tree graph model underlying a structure-oriented Lisp editor or an icon graph editor (Chapter III) can be related to the representation of a tree graph as a fractal curve which has a dimension varying continuously from 1 to 2 depending on the pattern reproduced by recursion (Mandelbrot, B. 1983). For example the dimension of the fractal binary tree graph of Figure 66 has been computed and is equal to 1.6942.

          Similarly, with more complex icon structures, or when editing several icon relations (CHILDLIST, PARENTLIST, etc.)


Figure 66. A Binary Tree Graph

 


 

within the same icon editor; the underlying model is of higher dimension (the tree graph representation has more "branches") and we will consider this dimension to be equal to two and a half (2D1/2).

          The Icon World is then a programming space of dimension approximately equal to two and a half (2D1/2) where a set of flat pictorial objects (icons) of any shape can be defined, organized and related to other icons. We propose to use this "larger" programming space as a support for visual program design.

         

Visual Functional Programming

          The approach to visual language design presented here is what we call visual functional programming, it differs from the visual programming schemes previously presented in that the principal control mechanism visualized by visual functional graphs is function calls.

          Functional graphs seem very natural for us to use in a LISP environment. Several tools are available in Interlisp-D (Xerox Artificial Intelligence Systems, 1985) which allow browsing of a function definition (the BROWSER package which is very similar in effect to the Smalltalk Browser, (Goldberg, A. et al. 1983)) as well as tree graph editing (GRAPHER). The Browser opens a graph representation obtained from exploration of a function definition s-expression (not compiled) where the function calls present in the definition are displayed as a tree graph (forest layout) where each node is the name of a function called. The definition browsed is displayed inside a graph editor window and function definitions can then be edited as LISP code by mouse selection of a browser node.

          Improving the Function Browser. The function browser can be easily extended by "advising" (Xerox Corporation, 1983) function definitions so that each node from the graph layout can be considered as a hypernode (it can be browsed itself by selection with the mouse). In the context of the icon manager, if the function has been associated to an icon sketch, it is also straightforward to replace the function name in the Browser graph by its icon sketch as we have already done for icon representation. It may also seem frustrating to the user to be able to explore a function definition as tree graph without being able to modify the function definition directly from editing the tree graph as we have done earlier with icon graphs (Chapter III). The only way to edit a function is yet to edit its LISP code. In other words, the improved version of the function browser is perfectly fitted to visualize a procedure as a graph of icon sketches or function names, but it cannot be used to visually design or modify that procedure. Note finally that it is possible to "advise" the Browser to allow editing of the tree graph representation of a function and we can write a function to read back the edited graph into a function definition in LISP, but a function produced this way is most likely incorrect because no information on function calls, and arguments are available inside the Browser graph and therefore cannot be restored from the tree exploration. Note that it is possible to replace any function call by a sequence of function calls of a single argument (Barendregt, H. P. 1984), and therefore to provide a functional graph which can be edited consistently, but this is impractical.

          Examples of graphs produced by the function Browser are shown in Figure 67 and 68. In the case of Figure 67, the graph represents the functions called from THINKJET; the graph has been limited to exclude lower level functions which were not compiled (the Browser omits all calls to compiled functions). This graph has also been arranged by moving nodes for better clarity, no node has been added or removed. Figure 68 presents GRAPHICON; it has been rearranged to bring up the symmetrical structure of the function: it belongs to an early implementation of icon structures which are accessed differently depending if the argument is an open icon (window) or a saved icon (atom). It is clear from this graph representation how the algorithm can be unified and this has been done with GRAPHSURGERY a more efficient version of GRAPHICON. Notice that in these two examples several functions are called recursively: this is visualized clearly on the graph by two identical boxed nodes connected by a single arc (other conventions are also acceptable).

          The visual programming environments presented in the above survey using flowchart, Nassi-Shneiderman diagram, state diagram or Petri net have a similar lack of


Figure 67. Function Browser Applied to THINKJET


Figure 68. Function Browser Applied to GRAPHICON

 


 

information considering procedure calls (they only represent graphically program flow or control flow). They clearly cannot provide efficient ways to edit a procedure from its graph (unless very restrictive constraints are imposed to the functions such as a single argument per function call) and they do not provide any information on the datatypes either (a unique datatype for all arguments should be imposed). In the case of Nassi-Sneiderman diagrams, datatypes are specified but cannot be visualized and function calls are not displayed. Dataflow diagrams provide a way to display procedures by visualizing arguments transfer, but no work seems to have been done to display visually general datatypes in dataflow diagrams.

          Each one of the previous methods uses a diagram representation which is an incomplete projection of the information contained in the original procedure and cannot be used as a general tool for visual program design.

          Topology for Visual Functional Programming. A solution to the problem of visually displaying a functional procedure so that it can be edited by visual interaction is proposed.

          We define a visual functional graph to be a graphical representation of function calls specified by the following topological rules:

1. Any function represented by its name or icon sketch on the screen can be expanded into a visual functional graph, similar to a graph editor window where the function name (or sketch) and a representation of its argument(s) figure on the top part of the window as a large title. A visual representation of the object (or datatype) returned by the function is presented in the bottom part of the window. The inner part of the window displays the graphical representation of the function body and can be scrolled in both directions if not all information can be seen at once.

2. A node of the visual functional graph is a function name (as it is invoked when a call to that function is performed) or its icon sketch as defined in the icon manager.

3. Arcs are drawn from each argument of the top part of the window to each node where the argument is requested by a sub-function call. Arcs are layed out in the order obtained from the argument list of the sub-function. Closed loop arcs are not allowed; recursive function calls are created by placing a sketch representation of the function edited inside its own editor window.

4. Each node is a hypernode in the visual functional graph and can be expanded into a similar graph to get more information on the internal structure of a sub-function.

5. Arcs are drawn from the bottom part of each node to the top part of other nodes and to subparts of the visual representation of the object returned by the function visualized. Visual representation of this object structure can be inferred from the position of the arc connections.

          We postulate that these five rules (or similar ones) can be used to produce a complete visual representation of a function as defined in the theory of lambda calculus (Barendregt, H. P. 1984), in functional programming (Backus, J. 1978) or functions in LISP. This general methodology has been succesfuly tested in the case of visual circuit design as demonstrated in a paragraph below.

          Note: by "complete visual representation", we intend a representation which allows visual editing of any constituent of the function.

          It is assumed in the topological rules that variable and datatypes have a visual representation (icon), and this is already true in the Interlisp-D environment where any datatype can be inspected (INSPECT) and represented by menu like images (we suggest that more abstract images should be available as icon sketches for datatypes). Special icon sketches can be introduced to represent internal variables in a function, as well as conditional statements in order to complete the visual functional graph if desired.

          The proposed functional graph visualizes each function call by several arcs so that each arc is a connection to one variable (or part of the variable if it is a list or a datatype). In the simplest case; one arc per variable in the call (whatever its type is), the graph representation is equivalent to lambda calculus (in LISP for example). A similar interpretation is possible and we can therefore go back from the icon graph representation to a lambda expression (the same syntax errors are also possible). If we imagine that visual representation of datatypes and variable structures are available, many semantical errors can be avoided, and the visual representation is definitely a plus for programming.

          Simplified or incomplete examples of functional graphs are presented in Figures 68 and 69. Figure 68 illustrates the difference between the graph obtained by Browsing of the function BITMAPMEASURE (BITMAPMESURE returns a measure of the bitmap information) and a complete graph of its LAMBDA expression. It is possible to edit the function definition from the second graph by editing the tree structure and this is exactly equivalent to what a structure-oriented editor does to Lisp code. This type of tree graph representation is not what we propose to allow efficient visual programming of visual procedures. Figure 70 presents a rearranged tree graph of the function BITMAPNIL? which tests if a bitmap contains any (black) information.

          Editing functions from such tree graph representation is already possible; advising the functions GRAPH.ADDLINKFN, GRAPH.DELETELINKFN, etc. of the Interlisp-D Grapher (Xerox Artificial Intelligence Systems, 1985) is possible so that changes in the tree graph directly correspond to changes in the function definition.


Figure 69. Function Browser Applied to BITMAPMEASURE and a Graph of the Complete Lambda Expression


Figure 70. Tree Graph of the Lambda Expression of the Function BITMAPNIL?

 


 

Visual Functional Graph: Parallel Processing. Assuming that programming is done in a (recommended) style such that functions do not produce side effects, it is possible to derive parallel control flow mechanism from exploration of the visual functional graph. For example, evaluation of a visual procedure can be done from the bottom of the graph to the top such that any group of arcs coming from a same function call is processed in parallel, and a function from the graph is evaluated only when all its arcs have returned a result.

          It may also be decided that as set of function sketches which are partly overlaping is such that the functions which overlap are sharing variables or, on the contrary, are absolutely transparent to each others' action and can therefore be executed in parallel when the architecture allows it.

          Designing such "parallel procedures" which are interpreted using independent processes can be used even on a single processor machine to improve the running time required by a procedure when user interation is involved. The fact that the user does not answer (or do any selection) does not keep other side processes (which do not depend on user input) to run. In other words: "give the machine as many things to do as you want: it will perform them in the best possible order (for itself)".

          Visual Functional Graph: Alternatives. A function can be linked to the others on the screen by a certain number of arcs placed in order from left to right, they represent argument transfer. Several presentations of a same algorithm are possible with more or less detail on the actual function calls. The simplest scheme mentions where there is argument transfer and ignores the detail about arguments passed, their number, their datatypes, further information can be specified later within another editing procedure. The next scheme represents as many slots on top of each function (or icon sketch representing the function) as it has of arguments. The slots may be filled with the arguments name which are usually sufficient to know what type of variable is expected. The third scheme uses extra symbols or names to represent the datatype of arguments to be passed in a function called.

          Each function, node in a visual graph representation, is linked to others with oriented arcs which represent argument control flow. In LISP, the function may evaluate its arguments of not (LAMBDA or NLAMBDA), it is possible to use different type of connections (arcs) to visualize this distinction. For example, arcs of size one for evaluated argument, size two for non evaluated, a pipe like arc for unspecified number of arguments.

          The graph editor (GRAPHER and GRAPHICON) can be modified to support the visual representation of arguments as in one of the schemes mentioned.

         

Application to Circuit Design

          The most advanced test of these visual programming methods can be found in the circuit editor which has been designed by Michel Venet (Venet, M. 1986). We had suggested that circuit design would be a good testing ground to explore the possibilities of visual programming available with our icon design methodology (the "icon manager"). Venet started with an earlier version of the icon manager and produced a circuit editor. Although a circuit icon requires special treatment in his implementation, the design of circuits from library circuits can be interpreted exactly in term of icon design (icon structure editing). The figures presented in this paragraph to give an example of circuit design and simulation within the Icon World have been created using Venet's circuit editor.

          In circuit design, the flowing of arguments from one function definition (circuit) to another is equivalent to the placing of a piece of metal conductor (or semiconductor if the circuit editor was extended for CMOS technology layout) and making connections at both end of the conductor. The difference with "argument flow" in the general case of visual function design, is that a circuit connection is usually layed out following more sophisticated design rules. For example the connections may be limited to vertical and horizontal lines of given sizes, connections between two inputs or two outputs are prohibited, etc.

         

The advantage of this (iconic) approach to circuit design is that logical connections are directly inferred from their drawing. This is done by automatically updating a set of icon relations at drawing time, so that there is equivalence between physical connections and logical connections among icons.

          The user may construct a logical circuit using and, or, xor, nand gates, etc. (which are instances of icons defined in a "circuit library") by placing them on a circuit editor window (Figure 71) and drawing (sketch drawing) connections among circuits as if he was wire-wrapping (Figure 72). Menu of commands is also available and can be used for simulation of the circuit drawn by the user (Figure 73). In the simulation of Figure 73, a grey shade represents a logical value 0 and a black shade a logical value 1. The inputs can be entered by the user (mouse selection) and the simulator produces the outputs with or without visualization of the intermediate logical values.

          A circuit definition in LISP can then be generated ("circuit compile") within the circuit editor by exploration of the icon relations. The new circuit is then ready to be included in the library. A set of basic icons can be seen from Figure 71 such as MAKE-CIRCUIT, EDIT-CIRCUIT, FILE-CIRCUIT, etc.; these icons are command icons.


Figure 71. Placing Circuit Icons in the Circuit Editor Window


Figure 72. Drawing Circuit Connections to Design CLK.FULL.ADD


Figure 73. Circuit CLK.FULL.ADD Completed and Used in a Simulation

 


 

For example, editing the circuit CLK-FULL-ADD of Figure 71 has been done by selection of the icon EDIT-CIRCUIT, selection of the sketch representation of CLK-FULL-ADD and validation. CLK-FULL-ADD was originally created using the command icon MAKE-CIRCUIT.

We suggest that a circuit history should be kept so that circuit changes are always possible using the command icon EDIT-CIRCUIT without having to edit the "circuit compiled" LISP definition.

          Conclusion on Visual Icon Management

          The icon oriented approach allows various type of "connections" between objects (icons) to be defined by visual interaction. These are interpreted in accordance with the rules of the procedure applied to perform them (the connections). What matters is that the user is aware each time of the consequences of the connections he draws. To each way of editing a set of icons to construct another one can correspond a special interpretation in terms of icon relations (or "connections") so that physical and logical relations between icons become one.

          For example, it is possible to think of piling up icon sketches on the screen in a way which will permit the construction of a new procedure just as we have reconstructed images from their iconic parts. Simply the rules used to infer the object relations are different: for example, one icon placed underneath the other may mean "use result from above block".

         

The icon interpreter (Chapter II) can largely beneficiate itself from the spatial arrangement of icons on the screen. Imagine the following scenario: when an icon command is created, a set of "empty slots" is generated and progressively filled by arguments that have been selected or that have been typed in. If another command is started (from inside the original one), it is layed out perpendicular to it and, the whole puzzle of commands is displayed, similar to a Scrabble game, until some of the branches resolve and are visually replaced by the iconsketch or the name of their results which are passed on as an argument to the next command.

          The possibilities offered by the use of a display screen which can be organized in about two and a half dimensions (the Icon World) are numerous, and all this is possible in a unified way from the definition of an icon which we have given. Application programs of the icon manager environment can integrate numerous visual programming concepts which apply to functional programming, circuit design, technical design, architecture, naval architecture, CAD/CAM, robotics (simulations, learning mode and control if an efficient hardware architecture is provided), visual problem solving, image processing, image databases and geographical database systems.

         

 


 

 

 

CHAPTER VI

ICON MATCHING AND INFORMATION RETRIEVAL

 
In this chapter, an algorithm to perform the exact matching of two icons is presented which can operate in the context of the icon manager. An algorithm is also proposed to compute a measure of icon matching, or of icon similarity. The case of approximate matching of two icons is then studied.

          Methodology for Icon Matching

          We consider here the problem of measuring similarities between two icons. Several approaches are available to compare the information content of two icons. We first illustrate the icon matching problem using examples.

          When two icons solely containing pictorial information must be compared, it is possible to first reconstruct two images from their iconic representations and then perform an image matching operation similar to MATCHBITMAP described in Chapter III. The result will be a measure of similarity of the two pictures. This method does not use the structural information of the icons but instead, produces two projections in the bitmap plane which can be compared. It may be sufficient for some applications and will be very efficient if the graphical hardware supports bitmap matching (functions BITMAPNIL?, BITMAPCOUNT, etc.).

          When the two icons to be compared contain text information, string matching should be used, when they contain s-expressions of function definitions, or data and procedures, they should be matched properly. As a general rule, matching two multiple media objects (icons) to check their consistency requires that methods for matching each media have been given to the system, and that these methods are consistently applied on each part of the object during the matching procedure.

          A general measure for the matching of two icons (icon1 and icon2) can be obtained by a weighted average of local measures (on sub-icons) of the form:

          d (content-type, sub-icon1, sub-icon2)

          where the distance measure "d" is dependent on the sub-icon content-types (e.g. bitmap, text, lisp expression, image contour description, etc.) and sub-icon1 and sub-icon2 are two sub-icons of the original icons icon1 and icon2 to be matched.

          We define the General Icon Matcher to be a general algorithm which can match two icons (in particular multimedia objects) using matching methods from each media. One can imagine that each time a new icon type is defined, a matching method must be created which will be used by the Icon Matcher (in a similar way, methods are defined to generate hardcopy of multimedia documents).

         

In a "consistent" icon manager (where icon sketches are attributed consistently to icons), we expect the general matching problem to be largely similar to the icon sketch matching problem; this assumption will simplify the task.

          An Algorithm Using Exact Icon Sketch Matching

          A simplified approach to the problem of matching icons (seen as multimedia objects) is derived from the use of icon sketch representation of objects under the following assumption: an object designed in any media (handled by the icon manager, e.g. text, image, sound, structured procedure) has an icon sketch which represent it without ambiguity.

          Icon matching can then be performed using the icon sketch information of each icon instead of its information content. An efficient matching operation is then expected when the graphical hardware supports Bit Block Test functions (an analog of the Bit Block Transfer (BitBlt) operator of Smalltalk which can test bitmap content in parallel). Bit Block Test would be an efficient replacement to the sequential functions BITMAPNIL?, BITMAPCOUNT and BITMAPMEASURE which we have defined).

         

This can be done by the procedure MATCHICON (a sub-procedure of the possible General Icon Matcher):

 

procedure MATCHICON
	arguments (REFNAME ICONNAME) 

          REFNAME <- iconprop INDEXNAME of REFNAME ICONINDEXNAME <- iconprop INDEXNAME of ICONNAME

          if / Icon is Already Indexed / REFNAME equal ICONINDEXNAME then return T else if / Matching of Icon Sketches / (MATCHICONSKETCH REFNAME ICONNAME) then / Recursive Matching on Relations / MATCHFAILURE <- () (for RELATION in (RELATIONNAMES of ICONNAME) do (for RELATIVE in iconprop RELATION of ICONNAME as REFRELATIVE in iconprop RELATION of REFNAME do if / Recursion / (MATCHICON REFRELATIVE RELATIVE) else MATCHFAILURE <- T unless isnull REFRELATIVE until MATCHFAILURE) until MATCHFAILURE finally if MATCHFAILURE return () else return T ) else return () end of procedure MATCHICON

 
The algorithm MATCHICON is in fact written after "manual" experimentation with the matching of several iconic examples produced under the icon manager (e.g. the matching of MAPEXAMPLE and MAPEXAMPLEREADFROMSCREEN following their CHILDLIST structure studied below). It is not implemented as the above procedure which treats in general all the relations defined for an icon.

          The matching of two icons according to the procedure MATCHICON is performed between a reference icon and the icon to be matched (ICONNAME). It is designed to exit at the first incompatibility encountered between the two icon structures, in order to minimize the computation time. If ICONNAME has already been indexed in the iconic environment, it only matches with icons which have the same INDEXNAME property (icon indexing is further described later in this chapter). Otherwise, the two icon sketches are matched and if they are exactly identical, all sub-icons obtained from common relations between the original icons are matched two by two. The matching succeeds only if all sub-icon matching succeed.

          Icon Matching Measures

          The result of the procedure MATCHICON (T or ()) is insufficient for applications where an approximate matching must be used to measure the similarity existing between two icons. For example to retrieve all the maps which have lakes "similar" to Lake1 and Lake2 in Figure 15; the matching must be performed only on the "Lakes" sub-icons and a measure of bitmap matching must be used.

         

The procedure MATCHICON can be modified to define the procedure MATCHICONLIST which returns a list of the results of all the successive (sketch) matching operations performed while exploring the structures of REFNAME and ICONNAME as follows:

 

procedure MATCHICONLIST
	arguments (REFNAME ICONNAME) 

          REFNAME <- iconprop INDEXNAME of REFNAME ICONINDEXNAME <- iconprop INDEXNAME of ICONNAME

          if / Icon is Already Indexed / REFNAME equal ICONINDEXNAME then return T else / Matching of Icon Sketches / splice-in the (two) lists: (list (MATCHICONSKETCH REFNAME ICONNAME) ) / Recursive Matching on Relations / (for RELATION in (RELATIONNAMES of ICONNAME) splice-in the lists: (for RELATIVE in iconprop RELATION of ICONNAME as REFRELATIVE in iconprop RELATION of REFNAME / Recursion / splice-in the lists: (MATCHICONLIST REFRELATIVE RELATIVE) ) ) end of procedure MATCHICONLIST

 

Where "splice-in the lists:" is equivalent to the NCONC function of LISP and returns a list formed by concatenating several lists.

          A weighted average can be formed from the result of MATCHICONLIST and used as a particular measure of similarity specific to a given "type" of icon structure (an icon structure "type" should be defined as: icons matching a specific "variable icon", Chapter II). For example the weight can be null except for the (sketch) matching of objects (sub-icons) having a "Lake" texture. Clearly, a matching measure must depend on the icon structure and content in order to be meaningful. The function MATCHICONLIST can be used as a basis to define more specific (application dependent) icon matching measures.

          Possible Algorithm Implementation. The MATCHICONLISP procedure is now given as an example (using Interlisp notations):

 

MATCHICONLISP
	(LAMBDA (REFNAME ICONNAME RELATION) 

          (NCONC (LIST (MATCHICONSKETCH REFICON ICONNAME) ) (FOR RELATIVE IN (GETICONPROP ICONNAME RELATION) AS REFREL IN (GETICONPROP REFICON RELATION) JOIN (MATCHICONLISP RELREF ICONNAME) ) ))

 

Note that MATCHICONLISP is a simplified version of MATCHICONLIST: it returns a list of all the (sketch) matching operations performed, but

only explores one icon relation at a time (e.g. CHILDLIST).

          MATCHICONLISP can also be written using a more general operator on icon structure:

           
(APPLICON 'MATCHICONSKETCH 'CHILDLIST '(REFICON ICONNAME))  

          where APPLICON applies recursively MATCHICONSKETCH along the icon structure of the relation CHILDLIST of both icons REFICON and ICONNAME. The procedure APPLICON introduced in Chapter III must first be modified so that it will apply a function to several (icon) arguments (e.g. MATCHICONSKETCH) while exploring "in parallel" their icon structures. APPLICON must also be modified to return a list of answers from all subsequent calls to MATCHICONSKETCH (similar to MATCHICONLIST).

          Algorithm Limitations. The two matching operations proposed above are not symmetrical:

1. The list of relations tested is obtained from (RELATIONNAMES of ICONNAME) which is usually different from (RELATIONNAMES of REFICON).

2. Relatives of ICONNAME are matched to relatives of REFNAME only if these exist (relatives of ICONNAME which do not have an "equivalent" in the structure of REFNAME are not considered).

3. To simplify the expression of EXACTMATCHICON procedure, relatives of both icons (for any given relation) were assumed to be placed in identical order beforehand.

          In other words, MATCHICON is performed to check if ICONNAME matches exactly with the reference icon REFICON on their common subset of relatives. This matching can be successful when ICONNAME has relations and relatives that REFICON does not have. More detail on the problem of icon matching can be obtained by considering the following method which should be invoked to reorder the icon relatives prior to applying the MATCHICON or MATCHICONLIST procedure (so that the assumption in 3 above is satisfied).

          Reordering Relational Expressions. Consider the list of (iconic) relational expressions of ICONNAME:

         

         

  
(a (a' a' ... a') b (b' b' ... b') ... ) 1 2 N' 1 2 M'
 

         

          where a, b, c, ... are the relation names and the sublist next to a relation is the list of relatives (icons) linked to ICONNAME by that relation. Consider also the list of relational expressions of REFICON which has been restricted to the same set of relations as ICONNAME:

         

         

  
(a (a a ... a ) b (b b ... b ) ... ) 1 2 N 1 2 M
 

         

         

We can define two sets of reordering functions for the sets of relatives: r, s, t, ... and r', s', t' for REFICON and ICONNAME respectively:

         

         

  

          r : [1, N ] ---> [1, NO]

          r': [1, N'] ---> [1, NO]; NO <= N + N'

         

          s : [1, M ] ---> [1, MO]

          s': [1, M'] ---> [1, MO]; MO <= M + M'; etc.

 

         

          Moreover, r', s', ... can be chosen so that the NO - N', MO - M', ... last relatives are the null element (represented by "-"). The reordered relational expressions become:

         

         

  
(a (a a ... a ) r(1) r(2) r(NO)  
b (b b ... b ) ... ) s(1) s(2) s(MO)

           
(a (a a ... a - - - ) r'(1) r'(2) r'(NO-N')

           
b (b b ... b - - - ) ... ) s'(1) s'(2) s'(MO-M')

 

         

          The matching of ICONNAME and REFICON can be done from these two reordered lists by matching two by two the icon sketches and then applying recursively the reordering procedure to the sub-icons in a procedure similar to MATCHICON.

          After a successful matching, ICONNAME can be used to complete the reference icon structure by bringing in information on relatives of ICONNAME which were unknown to to the environment before.

          Indexing of Icons. Since the matching operation can be costly, it is interesting to keep references between icons which have been successfully matched (in particular each time a new object is presented to the iconic environment and matched). A lot of processing time can be saved by replacing physical objects by logical structures, a lot of memory space can also be saved since similar icon sketches need to be saved only once if the information to perform changes with respect to an original sketch (the sketch of the first one indexed: REFNAME) are saved for all the others.

          In particular, a consistent iconic environment (an iconic environment with complete and consistent indexing of all icons) can be an excellent support to approach problems of artificial intelligence dealing with object manipulation, scene analysis and in particular, to support a "visual" model for object manipulation in robotics (Winston, P. H. 1984), (Ballard, D. H. 1982): the icons and icon relations (relations among objects) constitute the required database to approach these problems.

          In the present icon manager environment, INDEXNAME represents the internal name of the icon (the property HIDDENNAME is in fact used), which may be different from the icon name visible on the graphics display (window title). This mechanism was first used to allow creation of icon instances: icons which look exactly like the original icon but must be saved separately. Indexing can be used among similar icons as we mentioned above, but this has not been implemented. The indexing operation is equivalent in terms of icon properties ("relations") to the procedure INDEXICON:

 

          procedure INDEXICON arguments (REFNAME ICONNAME)

          (addiconprop INDEXEDCOPY of REFNAME value (listof ICONNAME (ICONCOPYCHANGES REFNAME ICONNAME))) (puticonprop INDEXNAME of ICONNAME to value REFNAME)

 
Example of Icon Matching. In Figures 16 and 23, two graphs of the icon structures of MAPEXAMPLE and MAPEXAMPLEREADFROMSCREEN are presented. The comparison of these graphs and (or) of the icon (and sub-icon) sketches may be interpreted as a final step in the process of recognition of MAPEXAMPLE by the iconic system:

          Following the definition of Icons such as Lake, Forest, City and Highway, it is possible to match the texture of the objects read back from the screen (by READREGIONFROMSCREEN) and to decide to which one of these classes each "object read" belongs. Therefore, we can consider that a graph such as the one in Figure 23 has been automatically generated by the "vision system" of the icon manager by simply "looking" at (procedure READREGIONFROMSCREEN and a few others) the bitmap image of MAPEXAMPLE (Figure 15). The ordering of subicons (LakeRead1, LakeRead2, etc.) can be expected to always be the same if a given scanning procedure is used on the bitmap of MAPEXAMPLE but this is not required if relation list reordering is used in MATHICON. In fact, the original icon MAPEXAMPLE has already a complete iconic structure which has been automatically produced when the user edited MAPEXAMPLE (the sketch: Figure 1) using the procedure ICONICMAPDEMO (successive results are in Figure 2 to 15).

          Matching MAPEXAMPLEFROMSCREEN to MAPEXAMPLE (the reference icon) is equivalent to matching the arcs of the two graphs of Figures 16 and 23 (they represent CHILDLIST relations) and matching (MATCHBITMAP) two by two the icon sketches. The procedure MATCHICON would fail in the present example since READREGIONFROMSCREEN was not (yet) able to read back the highways (highways are missing in the graph of Figure 23) and the icon MAPEXAMPLE has already information on highways (Figure 16). If a measure of matching was used instead (obtained from the result of MATCHICONLIST) we would expect a high score (> 90% when matching MAPEXAMPLEFROMSCREEN to MAPEXAMPLE), since the icon sketches of all (existing) sub-icons are very similar, except for the error on the boundary of the icons read (due to the present limitation of the procedure REGIONGROWER noticed in Chapter IV).

          Approximate Sketch Matching: Hardware Limitations

          The procedure MATCHICONSKETCH in the algorithm MATCHICON has been left undefined on purpose: while experimenting "manually" with icon matching, we have always used one of the operators BITMAPNIL?, BITMAPCOUNT or BITMAPMEASURE (presented in Chapter III). In practice, a few lines of code have to be added to do icon sketch matching: the relative position of the sketches must be compared (using reference coordinates from the icons REFNAME and ICONNAME respectively). It is important for some applications (e.g. geographical databases) to match the object content as well as the object location.

          The approximate matching of the content of two different icon sketches is obtained by using (BITMAPMEASURE (XORBITMAP REFNAME ICONNAME)) which returns an integer number (the ratio of the bitmap area over the number of different bits between REFNAME and ICONNAME). (BITMAPNIL? (XORBITMAP REFNAME ICONNAME)) should be used to compare exactly two bitmaps (returns T of ()).

          The approximate matching of icon sketch positions is straightforward to implement using the above operators on bitmaps, but may be too time consuming. The execution time for the matching of two bitmaps using BITMAPCOUNT or BITMAPNIL? is comparable to the timing of local bitmap transfer (Figure 70): approximately 30 seconds are required for a large bitmap (512x511). Therefore, matching two large bitmaps with a region of uncertainty on the sketch positions of the size 16x16 (e.g. (-8 -8 16 16)) (and using any algorithm calling BITMAPCOUNT) results in the worst case (matching fails in MATCHICON) in a computation time of 256x30 seconds: more than two hours!

          Implementation of the approximate sketch (bitmap) matching (with approximate positioning) is expected to be much more efficient on top of a specialized hardware architecture (Chapter VIII).

         

 


 

 

 

CHAPTER VII

MULTIMEDIA COMMUNICATION AND ICONS

 
In this chapter, simplified protocols which can be used to transmit icons between two lisp machines on an ethernet link are explored. Applications to progressive transmission of structured icons, multimedia messages and real time teleconferencing are also discused.

          Icon transmission over a local Ethernet link

          Icon File Transfer A straightforward (but time consuming) way to solve the problem of icon transmission between two workstations requires the use of a file transfer protocol (XNS, PUP or IP) in conjunction with the icon file system described in chapter I.

          A request for a specific icon file is emitted by machine A to machine B. If the icon has been recently modified on B side, B first remakes the icon file and then send an OK signal to station A. Station A then loads the icon file inside its current environment and finally performs a (MAKEICON ICONNAME) to bring the icon on the screen. If the icon contains bitmaps which cover a quarter of the display screen, the whole operation of refreshing an icon from one machine to the other can take more than a few minutes.

         

From experience with the Interlisp-D environment, it appears that the most time consuming part of the file transfer method is writing and reading bitmaps to and from a file (this has to be done sequentially while other operations can use Bit Block Transfer (BITBLT)). Therefore, when the icon transmitted is not constituted of any large bitmap but rather of sketch descriptions, the icon transmission using the file transfer method may well be acceptable.

          The Evalserver To test the capability of transfering visual information on an ethernet link between two 1108 workstations, another approach can be used: a procedure which transmits directly the property description of an icon from one machine to the other can be designed using the Evalserver protocol (Xerox Artificial Intelligence Systems, 1985). The evalserver can be used to evaluate commands from one machine into the context of another machine (function REMOTEVAL). It can used in particular spy into the information content of an icon defined on station B and reproduce it on the screen of station A. Note that it may not be necessary to load the whole content of an icon into the A environment.

          A function (MAKEICONCOMS, chapter II) has already been written which returns a procedure ICONNAMECOMS (list of commands) to save on a file the relational properties and bitmaps necessary to reconstruct an icon (and some of its relatives). We suggest to use it for icon transfer (but we did not test it in that context) as follows: to make an icon file each command from ICONCOMS is evaluated and the result printed in a file, this result should be redirected through the ethernet link and loaded into the other machine. In other words, the ICONNAMECOMS just have to be REMOTEVALED by machine A in order. A transmission procedure can then operate in three steps: first machine A does a remote eval of MAKEICONCOMS, second it uses the commands ICONNAMECOMS returned, remote eval each one of them and load (LOAD) the result in its environment, third it performs a MAKEICON on the icon name to display the icon transfered.

          Since information obtained by remote eval must fit a packet of size less than 534 bytes due to a current limitation of the Evalserver a special procedure had to be written first for bitmap transfer (and the above method has not been further developed).

          Nevertheless, it is clear that icon transfer can be done using the Evalserver protocol as follows:

1. Transfer the sketch description of the icon (if bitmap, use the bitmap transfer function).

2. Use a call to REMOTEVAL to transfer the property list of the icon (this also transfers the names of the icon relatives).

3. Apply recursively the transfer method (steps 1 and 2) to each relative of the icon.

          Transfering the sketch representation of the icon first (step 1) allows the icon to be directly displayed (from its sketch) on machine A, while the sketch information of the first layer of subicons is being loaded (step 2) and will be displayed next. This procedure allows progressive transmission of a structured icon (e.g. Figure 15).

          Bitmap Transfer

          The bitmap transfer procedure is now presented.

          The basic function which is necessary to perform icon transmission on an ethernet link is presented in a demonstration package (file: ICONTRANSFER) where the function GETREMOTESCREENPIECE allows one user to select any part of the screen from machine A and automatically recreates it on the screen of machine B. The information transmitted at this level is limited to a bitmap (bitmap transmission is probably the most costly operation). The file ICONTRANSFER also includes three functions which have been used to test the efficiency of bitmap transfer between two 1108 lisp machines: PACKETSIZETEST1, PACKETSIZETEST2 and PACKETSIZETEST3.

          Note: if the icon is known in terms of a sketch description (contours) or region filling description, it is clear that this type of information is more efficiently transmitted. This method may be preferred to bitmap transfer when the reconstruction time is not too long (Chapter IV).

         

The transfer of a remote piece of screen is performed by the procedure REMOTESCREENPIECE:

  

procedure REMOTESCREENPIECE argument (CLIENT) HOST <- (ETHERHOSTNUMBER) REMOTEINFO <- (REMOTEVAL '(SELECTSCREENANDMETHOD HOST) CLIENT)

          HOSTSCREENPIECE <- eval CREATEPART of REMOTEINFO REMOTESCREENMETHOD <- METHOD of REMOTEINFO NUMBEROFTRANSFERS <- TRANSACTIONS of REMOTEINFO

          start NUMBEROFTRANSFERS processes in a ring of processes PROCESSESRING using (RECONSTRUCTSCREENCHUNK HOSTSCREENPIECE (REMOTEVAL '(REMOTESCREENMETHOD) CLIENT))

          when all processes FINISHED, return (open HOSTSCREENPIECE)

          end of procedure REMOTESCREENPIECE

 

Station A which request a piece of screen to station B has the ethernet number HOST and station B, the ethernet number CLIENT. At first station A ask station B to perform (in its own environment) the procedure SELECTSCREENANDMETHOD, which prompts the user on B side for the selection of a part its own screen for transfer. The procedure SELECTSCREENANDMETHOD then creates a function definition on B side which contains all information necessary to complete the data transfer, it returns to station A a form containing:

1. An expression that A must evaluate to create a datatype (HOSTSCREENPIECE: icon, window or bitmap) identical to the one being originally selected on machine B.

2. The name of the function (METHOD) created on B side to insure the data transfer (REMOTESCREENMETHOD).

3. The number of processes (NUMBEROFTRANSFERS) which must be started on A side to insure complete transfer.

          Station A then starts NUMBEROFTRANSFERS processes which can be executed in any order and will restore the content of the selected bitmap into HOSTSCREENPIECE. Processes are maintained active in a limited number in PROCESSESRING which is a set of processes handles placed in a ring buffer. The whole transfer is completed when all NUMBEROFTRANSFERS processes have been executed. Station A then brings up the final result on the screen.

          Experimental Results

          Two algorithms very similar to the bitmap transfer procedure presented above have been implemented and tested between two Xerox stations connected by an ethernet link to transfer the bitmap content of icons or "snapped" pieces of screen, they can be found in the file ICONTRANSFER: PACKETSIZETEST1 and PACKETSIZETEST3. Bitmaps of varying sizes between 128x128 and 512x512 have been exchanged this way. The results are summarized in Figure 74 and 75 respectively.


Figure 74. Timing Graph for Bitmap Transfer on Local Ethernet Using Method 1


Figure 75. Timing Graph for Bitmap Transfer on Local Ethernet Using Method 3

 


 

The first algorithm does not create a ring of processes to reconstruct a bitmap from remote evaluation of REMOTESCREENMETHOD (the size of PROCESSESRING is zero). It is therefore waiting for each remote evaluation to be completed to start bitmap reconstruction. The transmission delay may seem too long in that case, but this is also due to the overhead required by the evalserver protocol; that the bitmap transfer has to be done twice in the optimal case: once by REMOTESCREENMETHOD and once by Evalserver to fill in the packets transmitted.

          This optimum is closely approached by the second algorithm where a ring of four (4) processes is used to perform bitmap reconstruction after remote evaluation has succeeded. On Figure 75, the time for transfer of a remote bitmap is approximately twice the time for transfering a bitmap locally and this for any bitmap size. The transfer time is also proportional to the bitmap size. Note: a procedure using a ring of processes to do remote evaluation and reconstruction of bitmap at the same time results in an overloading of the Evalserver when transmitting large bitmaps (512x512). Overloading the Evalserver usually results in some "ugly break" (TIMEEXPIRED) from which it is not always possible to conclude the transfer (this part of the evalserver should be "advised" a better behavior).

          A bitmap is transformed into a string datatype on machine B which is then sent over with its origin coordinate (in the bitmap) and its length so that reconstruction can occur on machine A in any order. The string size is limited to 512 so that the whole packet does not exceed the magic number 534, but it can be adjusted.

          In Figures 74 and 75 a comparison of transfer times for various packet sizes is presented. The transfer time of bitmap within the same machine is also presented for comparison.

          Future Work: Application to Multimedia Messages

          An experimental protocol for sharing bitmaps (Blink) has been developed at MIT by David Reed. This work is referred to by Sarin (Sarin, S. et al. 1985). Sarin presents an extended version of Blink, MBlink running on a DEC VAX mainframe and providing a shared bitmap environment between participants of a "bitmap conference" on Xerox Palo Alto workstations. MBlink basically provides real time update of a bitmap as well as remote cursors updates on each workstation with a time lag of about half a second.

          Although the demonstration of bitmap transfer between workstations presented above is not operational in real time, it can be used to update a bitmap in real time. The transfer of raw bitmaps can also provide interesting information communication between independent workstations on an ethernet link and should be combined with a bitmap sharing system (such as MBlink) for computer conferencing.

          When the method proposed to transfer icons is completed, the exchange of composed messages (multimedia) will be possible. We suggest for future work with the icon transfer protocol, the creation of a multimedia mail service, the remote counseling of students, the organization of conferences with exchange of text, lisp functions or images combined as structured icons. The adjunction of camera vision systems to two Xerox 1108 workstations (Vidicon camera, Epix image interface and processing board, and Xerox PC compatible BUS Extender board) is awaited to experiment with image transfer and teleconferencing under the icon manager.

          The icon manager could add to the experimental teleconferencing system available at IIT and written by Dr. Howard Dreizen for IBM PC compatible, an extended windowing capability which is not yet available with any workstation (including definition of structured objects with connected components of any shape, useful to transmit specific parts of the screen: formula, piece of text and data). The selection of a part of the screen (of any shape) for transmission during a teleconference was first proposed by Dreizen, it has an elaborate equivalent in the icon manager; a transparent icon (Chapters II and IV) which can be created by cursor drawing on the screen or using a set a preselected icon masks.

          Finally we suggest the use of a general partitioning of the screen by a structured icon for efficient teleconferencing with multimedia information: sub-icons are defined for specific information capture from user input (input icons for text, drawings, data, lisp functions) image input icon (running a connection process with the camera and vision system) and for remote output (information) display, a similar sub-icons system is used. The teleconferencing session begins with the users exchanging their communication icons which display an example from a previous conference. The teleconference itself consist simply in updating the content of any one of the remote input icons.

         

 


 

 

CHAPTER IX

CONCLUSION

 
A new approach to define and design icon-oriented software system has been presented. New concepts (DOWHATIDRAW, DRAWWHATIMEAN (Chapter II), icon matching and retrieval (Chapters III and VI)) have been introduced and need to be further explored. They will make icon-oriented programming a very powerful environment to be used on today's lisp machine as well as the fifth generation of computer systems.

          The Icon Manager has been described as a visual programming environment which can easily be modified by visual interaction to satisfy the user's needs: new icons can be created to perform actions on icons, new menus of actions can be constructed and attached to process icons and new functions (to operate on icons) can be designed and edited using icon graph programming, or visual functional graphs. A particular application of visual functional graphs to the design of electonic circuits has been presented.

          Several advantages derived from the definition of an icon, basis of this work, are summarized by the following:

1. The duality concept of abstract object and visual representation calls for new methods to combine and access information. For example, retrieving information in a generalized iconic environment is equivalent to recursively matching icon sketches while exploring the icon structure; it is also equivalent to the subgraph isomorphism problem of the graph theory. Icon matching will also permit icon (information) retrieval from visual queries within the iconic environment.

2. The proposed icon definition allows the creation icons which are multimedia documents and can be exchanged between two (or more) workstations for multimedia communications and computer conferences.

3. The possibilities offered by visual interaction with the Icon World are multiple. In the Icon World, icons can be organized in a program design space of dimension two and a half where association between icon physical relations and and logical relations is directly possible. Application programs of the icon manager environment can integrate numerous visual programming concepts which apply to functional programming, circuit design, technical design, architecture, naval architecture, CAD/CAM, robotics (simulations, learning mode and control if an efficient hardware architecture is provided), visual problem solving, image processing, image databases and geographical database systems.

4. Decomposition of images into subimages within hierarchical (icon) structure is also supported by the icon system and may find applications in image compression, progressive image transmission, image understanding and computer vision systems.

          The following presents several research orientations which may lead to the design of larger information systems based on the concept of icon and the design of icon expert systems. The possibilities to design a computer system specialized for the processing of icon data structures by a software design approach (GRAPHLOG) as well as a hardware design approach (pyramids of processors) are also surveyed.

         

Information Storage and Retrieval in an Icon System

          Expert Icon, Icon Expert System. We define an expert icon as the smallest element of an icon expert system (an expert system on top of an iconic system) which maintains consistent relational information among icons. Icons can be assimilated to nodes of an overall information graph (Chapter II) (e.g. Figures 16 and 23) which represent the total knowledge of an icon system. In this context an expert icon is any icon which can represent a hypernode of the icon system graph: a subgraph of the information graph (e.g. Figures 16 and 23).

          In other words, an icon becomes an expert icon as soon as it has been integrated as a subgraph of the expert information graph. This means that all relations between this icon and other icons previously in the environment (or yet to be inserted) have been checked out and recorded and that the icon (with all its relatives) has been attributed a unique place inside the information graph.

          Each time a new icon is presented to an icon expert by one of its sensors (camera and vision system, speech recognition system) or transfered over a transmission network or even loaded from a file server system, its logical graph must be properly compared to other nodes of the expert information graph. This matching operation is performed to decide if the information brought by the new icon is already in the expert (in which case we expect it to return the node name or to present a sketch) or if it should be incorporated inside the information graph (and where) or simply rejected as inconsistent.

          Every step in the construction of an expert system using icons relies on the ability of the icon manager to compare two icons and decide if they are exact copies of each other or if one should be used to complete the other or (and this is more delicate to define) if one can be transformed to look like the other.

          GRAPHLOG: Programming in Graph Logic

          General algorithmic techniques are available for solving graph matching problems such as finding graph isomorphism and subgraph isomorphism (Ballard, D. H. 1982), and can be applied to icon matching.

          In fact, an icon system (where each icon is defined with a consistent set of relations to other icons) can be interpreted as a Semantic Network. It is then possible to imagine a system which is based on logical inferences as the ones which can be derived from icon matching. We will call such a programming environment, GRAPHLOG. It can be thought of as an extension of the Prolog environment (Clocksin, W. F. et al. 1984). In a Prolog environment, a flat database of facts and clauses (and procedures) is layed down and processed by a motor of inference to derive new facts. A Graphlog environment would proceed similarly but facts would be expressed directly as graph nodes in a Semantic Network overall database, inference would be done by graph matching.

          A Graphlog environment is impossible to imagine without a strong graphical support, because of the complexity of graph representations. It is also impossible to imagine in a "flat" predicate calculus system (such as the usual Prolog environments) where any Semantic Network has to be broken down into a list of (linear) clauses.

          Finally, a Graphlog environment, which is the (future) ideal support to the design of icon expert systems, can probably be approached by an integration of Smalltalk in Prolog, or more closely (and related to our work): an icon manager and an icon matching inference system. It should be noted at this point that icon matching and graph matching are very suitable for parallel execution.

          The current effort in constructing parallel prolog machines may well be interpreted as the reconstruction of graphs of clauses which can be resolved in parallel (from the originally flat database of clauses). The "icon expert" approach treats directly every object of the icon manager environment as a graph structure which copies as closely as possible the internal structure of the original object (so that an image similar to the object is manipulated internally by the iconic system), the information on parallelism (possible parallel execution of operators) is then conserved. Such a system will find an ideal hardware support with computer architectures based on large arrays or pyramids of microprocessors.

          Quadtree Representation of Images and Pyramid of Processors

          From 1980 to 1984, several interesting results have been produced on quadtree decomposition of images (and octtrees for three dimensional objects) and image operations performed on quadtrees (Dyer, C. R. et al. 1980), (Jackins, C. L. et al. 1980), (Jones, L. P. et al. 1984), (Rosenfeld, A. 1984), (Samet, H. S. 1980) and Samet in (Samet, H. S. 1984) gives a detailed survey of several quadtree based algorithms. This sudden emergence of results on quadtree structures seems to have been largely supported by a common interest by the same research groups on pyramidal architectures of processors for image analysis, encoding and synthesis (Ahuja, N. et al. 1984), (Mago'', G. A. 1979) and (Tanimoto, S. L. 1983).

          Quadtrees have their analog in multiprocessor architecture which would allow computers built with this special type of architecture to perform several important operations in real time on large arrays of images: progressive query evaluation in geographical databases, multilevel image abstraction, regions from quadtree, boundary code from quadtree and progressive image transmission between two workstations.

          We found many of the results published in this domain to be very interesting: the distance between hierarchical structure of images and iconic hierarchical systems did not seem long to us, and the choice of the icon structure which is the basis of the icon management system described in Chapter II was certainly influenced by our previous interest in hierarchical structures of images.

          In a term paper presented by the author in May of 1984, "From a Tree of Processors to a Pyramid of Processor" (not published), a network of microprocessors executing a reduction language was studied (Mago'', G. A. 1979). The reduction language which can be directly "reduced" (executed) by Mago''s network of microprocessors is a subset of the famous Functional Programming Language (FP) presented by (Backus, J. 1978) at the Turing Award.

          We have shown in the paper mentioned above, that the tree of processors defined by Mago' was able to execute several LISP instructions in time N (N = length of the list) when time NxN was required in a stack environment; the same network was also theoretically extended to a pyramid of processors (planar communication between processors was added) and appeared to be an excellent support for several basic image operations. A similar multiprocessor architecture for image processing was found later in (Ahuja, N. et al. 1984) and can be compared to a quadtree structure, but little detail was given in this later work about communication between microprocessors and network partitioning. This topic has been extensively studied by Mago' in the case of a binary tree of processors and makes possible the extrapolation to a pyramidal network, which can to process images as well as a reduction language.

          The fact that the same computer architecture can efficiently execute (in parallel) expressions from a reduction language (a language "similar to LISP" is a sufficient definition for the current discussion) and perform operations on hierarchical structure of images is of great interest to us. Many operations which are required in an iconic system are too time consuming for a Von Neumann architecture and require a high degree of parallelism (bitmap matching, pattern matching in a bitmap, general procedure for icon matching, etc.).

          The major argument defended by Backus, (Backus, J. 1978) is that the parallelism of an algorithm must be naturally expressed directly in the programming language rather than decoded by a compiler after "encryption" inside a regular sequential procedure.

          Several operators which can be applied to icons in the icon manager environment are suitable for parallel execution. It should be noted that clearly defining the structure of a (multimedia) object (as an icon structure) has the advantage of making clear the possible parallel processing of the object. As a matter of fact, the operator APPLICON mentioned in Chapter II which is used to recursively apply an operator (e.g. OPEN, CLOSE, MOVE, SAVE, TOP, EDIT) to each node in the icon graph structure, is a perfect example.

          The need for a highly parallel architecture to support the design of a large icon manager (accessing a large database of icons) system comes from the fact that such a system will have to explore large quantities of pictorial objects as well as complex object structures. The hardware to support a large sized icon manager must, as a minimum requirement, be able to process, analyze, encode, transmit, synthesize large images in real time. It must also support operations on complex objects such as largely connected graphs or multimedia objects (icons).

          Todays lisp machines seem to be an excellent environment to support a small size icon manager system. A lisp machine executes very efficiently (but in a stack or a "spaghetti" stack) several instructions which are similar to those of a reduction language it can also efficiently process rectangular bitmap arrays (BITBLT operator) using a specialized graphic hardware. We have encountered while experimenting with the icon manager some limitations of our (small sized) workstation which makes it unsuitable to the design of examples larger than the ones presented. The main limitation is our 10 Mbyte hard disk which is absolutely inadequate, another limitation is regarding icon matching (Chapter VI) and we suggested then, the design of a Bit Block Test set of functions so that further experimentation with icon expert systems will become possible.

         

Pyramidal Multiprocessor Architecture

          A network can be designed by extension of Mago''s Network which is similar to a quadtree of processors. It can be seen as a pyramid of processor-planes with decreasing sizes (divided by 4 at each level). Each plane of processors processes an image plane: a processor is connected to its four nearest neighbors (to allow region shifting for example), a processor from an image plane is also connected to four processors from the image plane at the next lower level. Each processor or "cell" in the pyramid is the equivalent four of the "T-cells" defined by Mago', it has the following elements:

1. Sixteen internal microprocessors which can be regrouped in a small number of manners (height in the case of a tree of T-cells with four microprocessors) to insure that all the possible local partitioning of the pyramid the next levels are possible.

2. Twenty communication registers (sixteen for communication with its sons and four for communication with its only parent). Four communication channels are needed per son (or parent) to insure communication in all possible case of processor partitioning.

3. Four communication channels are added to insure communication with nearest neighbors in each image plane.

          The microprocessors used in the design of each "cell" have internal memory.

          When executing (reducing) an expression from a reduction language (comparable to some of the usual LISP-expressions), Mago''s Network proceeds in partitioning the expression into Reducible Applications (RA) (an expression which can be executed without further variable substitutions); the tree network is then partitioned from each RA up to the top so that an RA has a virtually independent architecture of processors dedicated to itself at execution time. Each RA is then executed independently and in parallel with the others.

          Exactly the same principle of reduction can be applied to the extended version of Mago''s Network, the Pyramidal Network, except that RAs have a two dimensional pattern instead of being lists of various sizes: they look like regions obtained by quadtree decomposition of an image. For better efficiency on this special quadtree architecture, a reduction language operating on arrays of various shapes (they do not have to be rectangular) could be designed. If the user does not like to use a reduction language with two dimensional expressions, the Pyramidal Network can be used to reduce one dimensional RA(s) as before. Since each cross-section of the pyramidal network is equivalent to one of Mago''s Networks (tree), it is clear that the pyramidal architecture proposed can execute reduction languages.

          When applied to image processing, a two dimensional RA can be interpreted as an operation to be performed on a restricted region of the image which is defined as the area under the RA. A closer study of the partitioning methods for the pyramidal architecture is beyond the scope of the present work, but we can derive from Mago''s results in the case of a binary tree that similarly, a partitioning of the whole quadtree (using the type of processing cell defined above) is possible so that each RA (whatever its shape is!) can be executed independently of the others. Each of these RA may represent any connected region from the image plane.

          Therefore, the following operations (which can be expressed in terms of reducible applications (RA)) can be formed to process images stored in internal image planes of the pyramid (quadtree) of processors and are expected to have very efficient performances:

1. Thresholding on the whole image plane or any particular region.

2. Identification of monochrome or mono-texture regions throughout the image plane: an entirely parallel execution of a region growing algorithm. Note: once regions have

been identified, the partitioning of the quadtree should be conserved and updated by successive operators.

3. Changes of texture or color in any particular region.

4. Any local logical or arithmetical operation applied to one image or one region of an image or between two images or two specific regions of these images (e.g. and, or, xor, average, digital filters, etc.).

5. Region intersection and union between images.

6. Region shifting, rotation and other supports for iconic animation.

7. Counting elements belonging to a region (area measurements and volume measurement in the case of an octtree).

8. Matching of images or specific regions of images as a result of 4 and 7 above.

9. Matching with approximate location of one object respectively to the other (Chapter VI) as a result of 6, 4 and 7 above.

10. Any meaningful combination of the above operations.

          The fact that both of the qualities which make a lisp machine adequate for small scale icon oriented applications, (e.g. visual programming systems, multimedia editing and multimedia message exchange) are also found in the pyramidal hardware architecture derived from Mago''s work, makes this type of architecture very attractive to us. Such an architecture seems to be an interesting candidate for the design of iconic machines and the support of iconic expert systems.

         

         

 


 

 

BIBLIOGAPHY

Ahuja, N. and Swamy, S. 1984. Multiprocessor Pyramid Architectures for Bottom-Up Image Analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. PAMI-6, No. 4, July 1984, pp. 463-475.

Backus, J. 1978. Can Programming be Liberated from the Von Neumann Style? A Functional Style and Its Algebra of Programs. Communications of the ACM, Vol. 21, No. 8, August 1978, pp. 613-641.

Ballard, D. H. and Brown, C. M. 1982. Computer Vision. Prentice Hall, Englewood Cliffs, New Jersey, 1982.

Barendregt, H. P. 1984. The Lambda Calculus: Its Syntax and Semantics. North-Holland, Amsterdam, The Netherlands, 1984.

Brown, G. P., Carling, R. T., Herot, C. F. and Kramlich, D. A. 1985. Program Visualization: Graphical Support for Software Development. IEEE Computer, Issue on Visual Programming, Vol. 18, No. 8, August 1985, pp. 27-35.

Chang, N. S. and Fu, K. S. 1981. Picture Query Languages for Pictorial Data-Base Systems. IEEE Computer, November 1981, pp. 23-33.

Chang, S. K., Jungert, E., Levialdi, S. and Tortora, G. 1983. IPL - An Image Processing Language and Programming Environment. Proceedings of the IEEE Workshop on Languages for Automation, November 9-11, 1983, Chicago, Illinois, pp. 78-84.

Chang, S. K. and Liu, S. H. 1984. Picture Indexing and Abstraction Techniques for Pictorial Databases. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. PAMI-6, No. 4, July 1984, n. pag. In xerograph.

Chang, S. K. and Clarisse, O. B. 1984 The Interpretation and Construction of Icons for Man-Machine Interaction In An Image Information System. Proceedings of IEEE Workshop on Languages for Automation, New-Orleans, November 1984, pp 38-45.

Chang, S. K. 1985. Image Information Systems. Special Issue of the proceedings of the IEEE on Visual Communications Systems, April 1985, n. pag. In xerograph.

Chock, M., Cardenas, A. F. and Klinger, A. 1981. Manipulating Data Structures in Pictoral Information Systems. IEEE Computer, November 1981, pp. 43-50.

Clarisse, O. B. and Chang, S. K. 1985. An Icon Manager in LISP. Proceedings of the IEEE Workshop on Languages for Automation, Cognitive Aspects in Information Processing, Universitat de Palma, Palma de Mallorca, Spain, pp. 116-139.

Clocksin, W. F. and Mellish, C. S. 1984. Programming in Prolog. Springer-Verlag, New York, second edition, 1984.

Dreyfus, H. L. 1972. What Computers Can't Do: A Critic of Artificial Reason. Harris-Row, New York, first edition, 1972.

Dyer, C. R., Rosenfeld, A. and Samet, H. 1980. Region Representation: Boundary Codes from Quadtrees. Communications of the ACM, Vol. 23, No. 3, March 1980, pp. 171-172.

Forsdick, H. C., Thomas, R. H., Robertson G. G. and Travers V. M. 1984. Initial Experiment with Multimedia Documents in Diamond. Proceedings of the IFIP 6.5 Conference, May 1984, n. pag. In xerograph.

Frank, A. J., Daniels, J. D. and Unangst, D. R. 1980. Progressive Image Transmission Using a Growth-Geometry Coding. Proceedings of the IEEE, Vol. 68, No. 7, July 1980, pp. 897-909.

Fu, K. S. 1974. Syntactic Methods in Pattern Recognition. Academic Press, 1974.

Fu, K. S. and Mui, J. K. 1981. A Survey on Image Segmentation. Pattern Recognition, Vol. 13, 1981, pp. 3-16.

Glinert, E. P., Tang, M. D. and Tanimoto, S. L. 1985. A Three-Dimensional Graphical Paradigm for Representing Programs. Technical Report 85-07-01, Department of Computer Science, Univerity of Washington, July 1985.

Glinert, E. P. and Tanimoto, S. L. 1984. Pict: An Interactive Graphical Programming Environment. IEEE Computer, November 1984, pp. 7-25.

Goldberg, A. 1984. Smalltalk-80, The Interactive Programming Environnment. Addison Wesley Series in Computer Science, Reading, Massachusetts, 1984.

Goldberg, A. and Robson, D. 1983. Smalltalk-80, The Language and its Implementation. Addison Wesley Series in Computer Science, Reading, Massachusetts, May 1983.

Herot, C. F. 1980. Spatial Management of Data. ACM Transactions on Database Systems, Vol. 5, No. 4, December 1980, n. pag. In xerograph.

Hofstadter, D. R. 1980. Godel, Escher, Bach: An Eternal Golden Braid. Vintage Books, New York, September 1980.

Horowitz, S. L. and Pavlidis, T. 1976. Picture Segmentation by a Tree Traversal Algorithm. " Journal of The Association for Computer Machinery, Vol. 23, No. 2, April 1976, pp. 368-388.

Institute of Electrical and Electronics Engineers, Inc. 1985. IEEE Spectrum, Applications: Display Technologies. IEEE Press, Vol. 22, No. 7, July 1985, pp. 52-73.

Jackins, C. L. and S.L. Tanimoto, S. L. 1980. Octtrees and their Use in Representing Three-Dimensional Objects. Computer Graphics and Image Processing, Vol. 14, 1980, pp. 249-270.

Jacob, R. J. K. 1985. A State Transition Language for Visual Programming. IEEE Computer, Issue on Visual Programming, Vol. 18, No. 8, August 1985, pp. 51-59.

Jarvis, J. F., Judice, C. N. and Ninke, W. H. 1976. A Survey of Techniques for Display of Continuous Tone Pictures on Bilevel Displays. Computer Graphics and Image Processing, Vol. 5, nd., pp. 13-40.

Jones, L. P. and Iyengar, S. S. 1984. Space and Time Efficient Virtual Quadtrees. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. PAMI-6, No. 2, March 1984, pp. 244-247.

Kramlich, D. 1984. Spatial Data Management on the USS Carl Vinson. Database Engineering, Vol. 7, No. 3, September 1984, pp. 10-19.

London, R. L. and Duisberg, R. A. 1985. Animating Programs Using Smalltalk. IEEE Computer, Issue on Visual Programming, Vol. 18, No. 8, August 1985, pp. 61-71.

Lu, S. Y. 1984. A Tree-Matching Algorithm Based on Node Splitting and Merging. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. PAMI-6, No. 2, March 1984, pp. 249-256.

Mago', G. A. 1979. A Network of Microprocessors to Execute Reduction Languages. International Journal of Computer and Information Science, Part 1, Vol. 8, No. 5, May 1979, pp. 349-385, and Part 2, Vol. 8, No. 6, June 1979, pp. 435-471.

Mandelbrot, B. 1983. The Fractal Geometry of Nature. W. H. Freeman and Company, New York, 1983.

Mandelbrot, B. 1984. Les Objets Fractals. Flammarion, Nouvelle Bibliote`que Scientifique, France, 1975, deuxie`me edition 1984.

Moriconi, M. and Hare, D. F. 1985. Visualizing Program Designs Through PegaSys. IEEE Computer, Vol. 18, No. 8, August 1985, pp. 72-85.

Oliver, M. A. and Wiseman, N. E. 1983. Operation on Quadtree Encoded Images. The Computer Journal, Vol. 26, No. 1, 1983, pp. 84-91.

Raeder, G. 1985. A Survey of Current Graphical Programming Techniques. IEEE Computer, Issue on Visual Programming, Vol. 18, No. 8, August 1985, pp. 11-25.

Restak, R. M. 1984. The Brain. Bantam Books, New York, October 1984.

Roetling, P. G. 1976. Halftone method with edge enhancement and Moire' Suppression. Journal of the Optical Society of America, Vol. 66, No. 10, October 1976, pp. 985-989.

Rohr, G. 1984. Understanding Visual Symbols. Proceedings of the IEEE Workshop on Visual Languages, Hiroshima, Japan, December 9-11, 1984, n. pag. In xerograph.

Rosenfeld, A. 1984. Quadtrees and Pyramids: Hierarchical Representation of Images. Computer Science Center, Technical Report, University of Maryland, College Park, 1984.

Roussopoulos, N. and Leifker D. 1984. An Introduction to PSQL: A Pictorial Structured Query Language. Department of Computer Science, University of Maryland, College Park, Maryland, July 1984, pp. 1-28.

Reynolds, J. K., Postel, J. B. and Katz, A. R. 1985. The DARPA Experimental Multimedia Mail System. IEEE Computer, Issue on Multimedia Communications, October 1985, pp. 82-89.

Samet, H. S. 1980. Region Representation: Quadtrees from Boundary Codes. Communications of the ACM, Vol. 23, No.3, March 1980, pp. 163-170.

Samet, H. S. 1984. The Quadtree and Related Hierarchical Data Structures. Computing Surveys, Vol. 16, No. 2, June 1984, pp. 187-259.

Sarin, S. and Greif, I. 1985. Computer-Based Real-Time Conferencing Systems. IEEE Computer, Issue on Multimedia Communications, October 1985, pp. 33-45.

Shu, N. C. 1985. FORMAL: A Form Oriented, Visual-Directed Application Development System. IEEE Computer, Issue on Visual Programming, Vol. 18, No. 8, August 1985, pp. 38-47.

Stefik, M., Bobrow, D. G., Mittal, S. and Conway, L. 1983. Knowledge Programming in Loops: Report on an Experimental Course. The AI Magazine, Fall 1983, pp. 3-13.

Tanimoto, S. L. 1976. An Iconic/Symbolic Data Structuring Scheme. Pattern Recognition and Artificial Intelligence, Academic Press, 1976, pp. 452-471.

Tanimoto, S. L. 1983. A Hierarchical Cellular Logic. Technical Report #83-10-06, Departement of Computer Science, FR-35 University of Washington, Seattle WA 98195, October 1983.

Teitelman, W. and Masinter, L. M. 1981. The Interlisp Programming Environment. Computer, Vol. 14, No. 4, April 1981, pp. 25-34.

Venet, M. 1986. A Schematics' Capture Circuit Editor. Masters' Thesis to be defended in January 1986, Electrical and Computer Engineering Department, Illinois Institute of Technology, Chicago.

Von Neumann, J. 1979. The Computer and The Brain. Yale University Press, New Haven and London, 1979. Winston, P. H. 1984. Artificial Intelligence. Addison-Wesley, New York, 1981, second edition: July 1984.

Xerox Corporation, 1983. Interlisp Reference Manual. Xerox Corporation, n. p. 1983.

Xerox Artificial Intelligence Systems, 1985. Lisp Library Packages. Xerox Coorporation, n. p. March 1985.


Markup created by unroff 1.0,    August 19, 2002,    olivier at erire.com