
 

6.1 HTML, STRINGS

 

 

INTRODUCTION

 

HTML (Hypertext Markup Language) is a document description language for websites. A website shown in the browser, however complicated it may appear, is generated from an ordinary text file that contains markup for the layout in addition to the visible text. This additional information determines the appearance of the website. The markup consists of tags, which usually come in pairs of a start tag and an end tag. The start tag begins with the angle bracket < and closes with the angle bracket >; the end tag starts with </ and also closes with >.

The basic structure of an HTML text file consists of the tags <html> and <body> as well as the corresponding end tags.

<html>
   <body>
      TigerJython Web-Site
   </body>
</html>

The letter case of the tags, as well as the line breaks and indentation, do not matter for the layout of the document.

PROGRAMMING CONCEPTS: HTML, hyperlink, string, constant data type

 

 

WHAT ARE STRINGS?

 

In many programs, including those written for the web, you need a data type to store text. Text is a sequence of characters (a character string) that you can type on the keyboard. In addition, you need some control characters, for example to indicate a line break. In Python, you use the data type str for character strings.

The text of a string is placed between double or single quotes. You can think of strings as lists whose elements are individual characters. Most familiar list operations also apply to strings, with one important difference: you can read a single character from the string with an index (square brackets), but you cannot change the character with an assignment, because a string is an immutable data type. If you want to change a string, you have to create a new one [more... In Python 2, strings are represented internally with 8-bit ASCII characters. If you prefix the string with u, the 16-bit Unicode representation is used instead. In Python, characters are strings of length 1].
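The difference between reading and changing a character can be tried out with a minimal sketch like the following. The variable names are chosen only for illustration, and print is written with parentheses so that the sketch runs under Python 2 as well as Python 3.

```python
s = "Python"

# Reading a single character by index works as it does for lists.
first = s[0]

# Assigning to an index fails, because strings are immutable.
try:
    s[0] = "J"
    changed = True
except TypeError:
    changed = False

# To "change" a string, build a new one from pieces of the old one.
t = "J" + s[1:]

print(first)
print(changed)
print(t)
```

The original string s stays untouched; t is a new string object.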

Your program defines HTML-formatted text as a string html and writes it out to the console.

html = "<html><body>TigerJython Web Site</body></html>"
print html

In order to run through a string character by character, you can use a for loop with an index:

html = "<html><body>TigerJython Web Site</body></html>"

for i in range(len(html)):
   print html[i]

It is more elegant, however, to use a for loop with the keyword in:

html = "<html><body>TigerJython Web Site</body></html>"
for c in html:
   print c

A string can also contain special control characters. These escape sequences are introduced by a backslash, for example \n for a new line (newline, also called a line feed <lf>). For example, you can create the format shown at the very beginning of the chapter with:

html = "<html>\n   <body>\n      TigerJython Web Site\n   </body>\n</html>"
print html
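As a small sketch of how the escape sequence behaves, the following lines check that \n counts as a single character and split the text at the line breaks. The method split() is not introduced in this chapter and is used here only for illustration.

```python
html = "<html>\n   <body>\n      TigerJython Web Site\n   </body>\n</html>"

# "\n" is typed as two keystrokes but stored as one character.
newline_length = len("\n")

# Splitting at the newline characters yields the individual lines.
lines = html.split("\n")

for line in lines:
    print(line)
```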

You can also read text from a text file. To do this, use any text editor to create the file welcome.html with the following content in the directory where your program is located:

<html>
   <body>
      <h1>TigerJython Web-Site</h1>
      Good morning
   </body>
</html>

You create a heading with the tag <h1>. Your program reads the text file into the string html and then writes it out again to the console.

html = open("welcome.html").read()
print html
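The one-line version above leaves it to Python to close the file. A slightly more careful variant, sketched here under the assumption that the with statement is available in your Python version, closes the file explicitly. To keep the sketch self-contained, it first writes the file into a temporary directory (a hypothetical location chosen so that no existing file is overwritten).

```python
import os
import tempfile

content = ("<html>\n   <body>\n      <h1>TigerJython Web-Site</h1>\n"
           "      Good morning\n   </body>\n</html>\n")

# Hypothetical location: a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "welcome.html")

# Write the file; the with statement closes it automatically.
with open(path, "w") as f:
    f.write(content)

# Read the whole file back into a single string.
with open(path) as f:
    html = f.read()

print(html)
```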

 

 

MEMO

 

A string is an immutable object consisting of individual characters. You can read individual characters with an index. However, if you try to replace a character with an assignment, you get an error message. There is no character type in Python, since single characters are also regarded as strings.

Text files are opened with open(), to which you pass the path to the file. The path can be relative to the directory where tigerjython2.jar is located, or absolute if you prepend a forward slash (in Windows, possibly also a drive letter), for example:

 open("test/welcome.html")             welcome.html in the subdirectory test of the home directory of tigerjython2.jar
 open("/myweb/test/welcome.html")      welcome.html in the directory /myweb/test of the drive where tigerjython2.jar is located
 open("d:/myweb/test/welcome.html")    (only for Windows) welcome.html in the directory \myweb\test of the drive d:

You can join two strings with the addition operator + (concatenation). However, it is important that both operands really are strings. For example, "pi = " + 3.14159 leads to an error message. In this case, you have to write "pi = " + str(3.14159) so that the number is first converted into a string.
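The following sketch shows both cases side by side. The error is caught with try/except only so that the program keeps running; the variable names are illustrative.

```python
pi = 3.14159

# Adding a number to a string raises a TypeError.
try:
    text = "pi = " + pi
    failed = False
except TypeError:
    failed = True

# Converting the number with str() first makes the concatenation work.
text = "pi = " + str(pi)

print(failed)
print(text)
```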

The most important operations with strings:

 
 

s = "Python"             defines a string (or with single quotes: s = 'Python')
s[i]                     accesses the character at index i
s[start:end]             new sub-string with the characters from start up to, but not including, end
s[start:]                new sub-string with the characters from start to the end of the string
s[:end]                  new sub-string with the characters from the beginning up to, but not including, end
s.index(x)               index of the first occurrence of x (raises a ValueError if x is not found)
s.find(x)                index of the first occurrence of x (-1: not found)
s.find(x, start)         index of the first occurrence of x, searching from start
s.find(x, start, end)    index of the first occurrence of x, searching from start up to, but not including, end
s.count(x)               returns the number of occurrences of x
x in s                   returns True if x is contained in s
x not in s               returns True if x is not contained in s
s1 + s2                  concatenation of s1 and s2 as a new string
s1 += s2                 replaces s1 by the concatenation of s1 and s2
s * 4                    new string with s repeated four times
len(s)                   returns the number of characters
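A few of these operations, applied to the string "Python", might look as follows. This is only a sketch with illustrative variable names. It also shows that index() and find() differ when the search text is missing: find() returns -1, while index() raises a ValueError.

```python
s = "Python"

char = s[0]              # single character at index 0
middle = s[2:4]          # characters at index 2 and 3
tail = s[2:]             # from index 2 to the end
head = s[:2]             # from the beginning up to index 1

found = s.find("t")      # index of the first "t"
missing = s.find("x")    # "x" does not occur

# index() behaves like find(), but raises an error if nothing is found.
try:
    s.index("x")
    index_failed = False
except ValueError:
    index_failed = True

contains = "th" in s
doubled = s * 2
length = len(s)

print(middle)
print(found)
print(missing)
print(doubled)
```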

 

 

 

WEB BROWSER

 

The most important task of a web browser is to interpret the HTML tags and display the page in a window on the screen according to the layout information. You can display the file welcome.html on your PC with any installed web browser (Firefox, Explorer, Chrome, Safari, Opera, etc.).

TigerJython provides you with a simple browser window as an instance of the class HtmlPane. The method insertText() causes the input string to appear as a web page in the window.


 

 

from ch.aplu.util import HtmlPane

html = open("welcome.html").read()
pane = HtmlPane()
pane.insertText(html)

 

MEMO

 

A web browser interprets the HTML markups and displays the document according to its layout information.

HtmlPane knows only the basic HTML tags. Displaying complex HTML pages is not supported. You can also use an HtmlPane to display your program output in a separate window with a pleasing layout, rather than writing it out to the console.
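One way to prepare program output for an HtmlPane is to assemble the HTML text with string concatenation. The following is a minimal sketch with made-up content (a short list of square numbers). The HtmlPane calls are left as comments because the module ch.aplu.util is only available inside TigerJython.

```python
# Build an HTML page from program data using string concatenation.
html = "<html><body><h1>Square numbers</h1>"
for n in range(1, 6):
    html += "<p>" + str(n) + " squared is " + str(n * n) + "</p>"
html += "</body></html>"

print(html)

# In TigerJython, the page could then be shown with the calls from the text:
# from ch.aplu.util import HtmlPane
# pane = HtmlPane()
# pane.insertText(html)
```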

 

 

HYPERLINKS

 

The explosive growth of the web is largely due to the fact that a website can contain elements that lead, with a simple mouse click, to other websites located on any other web server, anywhere in the world. Elements of this type are called hyperlinks. Hyperlinks can build an interconnected information structure, similar to a spider web.

With a text editor, create the file welcomex.html, which contains the link tag <a>. It also uses the paragraph tag <p>, which by default starts a new section with a line break.

<html>
   <body>
      <h1>TigerJython Web-Site</h1>
      <p>Good morning!</p>
      <a href="http://www.tigerjython.ch/">TigerJython Home</a>
   </body>
</html>

You have to enable hyperlinks in your program by defining the function linkCallback() (or a function with any other name) and registering it with the named parameter linkListener. Clicking on the link invokes the callback, and the URL contained in the link tag is passed to it as an argument.

from ch.aplu.util import HtmlPane

def linkCallback(url):
    pane.insertUrl(url)

html = open("welcomex.html").read()
pane = HtmlPane(linkListener = linkCallback)
pane.insertText(html)

 

MEMO

 

Hyperlinks are cross references in a document with which you can jump to other documents. Linked documents are a characteristic feature of the World Wide Web.
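Since a link tag is ordinary text, the target address can be picked out of it with the string operations from this chapter. The following sketch assumes that the address is enclosed in double quotes directly after href= and that the closing quote is present. It is not a general HTML parser.

```python
html = '<a href="http://www.tigerjython.ch/">TigerJython Home</a>'

marker = 'href="'
start = html.find(marker)

if start == -1:
    # No link found in the text.
    url = ""
else:
    # The address begins right after the marker and ends at the next quote.
    start += len(marker)
    end = html.find('"', start)
    url = html[start:end]

print(url)
```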

Unfortunately, HtmlPane displays web pages only incompletely. However, you can open the default browser with HtmlPane.browse() [more... Starting another program or process from within a program is called "spawning"].

from ch.aplu.util import HtmlPane
HtmlPane.browse("www.tigerjython.com")

 

 

EXERCISES

 

1.


With the tag 

<img src="gifs/tigerlogo.png" width="120" height="116"></img>

you can embed an image that is located in the subdirectory gifs of the directory where your program resides.

Create a file showlogo.html and a program that shows the following in an HtmlPane:

You can download the image tigerlogo.png here.

2.

Define the strings last name, first name, street, and location as well as the house number and zip code either with your personal information or with something made up. Link these strings together into a single string address with the + sign, so that print(address) writes out the formatted information:

first name, last name
house number, street
zip code, location