Seeking a good Greek grammar guide


joe

  • Modern Greek Verbs
  • Newbie
  • *
    • Posts: 69
I should mention, I did the figures in the chapter on Message Perspective using only the Unicode character set. Microsoft Internet Explorer doesn't like Unicode. It goofs up the figures, but Mozilla and Firefox work fine. I'll eventually redo them as GIFs.

In order for Internet Explorer to work OK with Unicode you have to make sure all the font declarations in your CSS are for Unicode fonts (Arial Unicode MS, for example). Firefox can cope with this not being so, as it replaces missing characters with glyphs from other fonts.


Hello Spiros,

Glad to have found you. How are you?

1. This HTML does not use CSS.

2. The only explicit reference to the character set occurs in the HTML <head>:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

This tag is necessary, because IE does not use Unicode by default. Without it, Windows users must set the browser encoding to UTF-8 by hand to see the Greek, and Windows rules the world, so I have to conform.

3. Microsoft Times New Roman is a Unicode font. M-TNR supports all the Unicode characters I'm interested in, including the box-drawing characters. I swiped a copy from Windows and installed it on my Linux box at home just to make sure.

Actually, you can't call a font by its character set name. There is no such thing as a "unicode" font. Fonts support whatever characters they want, usually an international subset of Unicode. In fact, M-TNR supports all the European languages, including Cyrillic. But it does not support the Asiatic writing systems. You'd need a font designed especially for them.

Anyway, the same font is used by all character set encodings. All fonts, at least the SGV fonts developed by Adobe, of which Microsoft True-Type is a rip-off, map the character set to the glyph set internally.

There is a table inside the font for each character set supported. If the font does not contain a map for your character set, then the font cannot be used with that character set. But Unicode is supported by all the common (if not absolutely all) fonts. No, that's not true. I've seen speciality fonts which support only their own, small, special-purpose character sets.
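As a rough illustration of the character-to-glyph table described above, here is a toy model in Python. The table contents, the glyph names, and the function name `glyphs_for` are invented for this example; a real font stores the mapping in a binary table, not a dictionary.

```python
# Toy model of a font's internal character-to-glyph map.
# The entries and glyph names below are made up for illustration.
TOY_CMAP = {
    0x0041: "A",          # LATIN CAPITAL LETTER A
    0x03B1: "alpha",      # GREEK SMALL LETTER ALPHA
    0x03C9: "omega",      # GREEK SMALL LETTER OMEGA
    0x2500: "box_horiz",  # BOX DRAWINGS LIGHT HORIZONTAL
}

MISSING_GLYPH = ".notdef"  # placeholder used when the font has no entry


def glyphs_for(text, cmap=TOY_CMAP):
    """Return the glyph name for each character, or the placeholder."""
    return [cmap.get(ord(ch), MISSING_GLYPH) for ch in text]


if __name__ == "__main__":
    # U+1F70 (alpha with varia, a polytonic letter) is not in the toy table.
    print(glyphs_for("A\u03b1\u2500\u1f70"))
```

A character with no entry falls through to the placeholder, which is the point where a browser would either show a box or borrow a glyph from another font.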

Microsoft TNR is also a "Greek" font. In fact, Microsoft TNR even supports polytonic Greek (but not on Windows 98, nor on XP, apparently). There is no glyph substitution. The polygreek looks pretty as a picture on Linux. (I have fonts on my Linux box which do not support polygreek, and you can see the glyph substitution.)

The only problem seems to be that Internet Explorer is not using the font properly.

How do I know?

Because Firefox on the Windows platform correctly (well, almost correctly) renders all the Greek characters, and the Unicode figures too. Both browsers, IE and Firefox, are using the same font, Microsoft TNR, and the same platform, Windows XP.

To see the difference, open the following URL, an entry from the Babiniotis dictionary, in both Firefox and Internet Explorer:

συνίσταμαι

You'll see one works better than the other. Mozilla works like a charm on my Linux machine at home. It's perfect.

Joe

PS

Microsoft is, like IBM was, and still is, a marketing company, not a technology company. Their engineering is mediocre, to say the least - limited to "reverse-engineering" - and their attention to detail even worse:

1) Windows still has an incomplete or defective implementation of Unicode, and in particular, utf-8. They just refuse to cooperate.

2) IExplorer has a defective implementation of HTML:

- Can't embed <TABLE> into <A>
- Can't embed <UL>  (or any list) into <TABLE>
- I wonder what IE would do if I put a <UL> in a <TABLE> in an <A>, cry?

When things break, you just change your page, because IE is the de-facto standard. That's what I'm going to do with those Unicode figures. I can create GIFs using the Gimp.

3) IE does not correctly use Microsoft's own fonts! (see above) Actually, I don't think they're Microsoft fonts. They've been developed by another company. Microsoft just distributes them.

4) Frontpage breaks working HTML.

I use Frontpage because it is the only "reasonable" utf-8 editor on a Windows platform (I don't consider Word to be "usable", at least not for HTML (see below)), and Notepad, although utf-8, doesn't understand Unix new-lines.

When I do a Search/Replace to finalize my URLs, Frontpage breaks the <ul> inside a <blockquote> near the bottom of the page:

Welcome to Modern Greek Verbs

It goofs up, but I haven't taken the time to figure out why.

[Hey, Maybe they should call it "Breakpage" instead of "Frontpage".]

So I do the search/replace using Notepad, but since Notepad doesn't recognize the Unix newline character (which is the single-byte ASCII line feed and has the same value in UTF-8, ISO 8859-7, and Microsoft Greek), I can't see what I'm doing: the text is just a blob, a very long, single line.

[Notepad only understands the CR/LF combination. I haven't figured out how to shut that off. All my Linux editors can do both forms of line termination. Maybe I should use the Microsoft CR/LF with UTF-8.]
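The line-ending point can be checked with a few lines of Python. This is only a sketch: the codec names `iso8859_7` and `cp1253` are Python's names for ISO 8859-7 and Microsoft Greek, and the helper names are made up for the example.

```python
# The line feed is the same single byte in all three encodings mentioned.
ENCODINGS = ["utf-8", "iso8859_7", "cp1253"]


def newline_bytes():
    """Map each encoding name to the bytes it produces for a line feed."""
    return {name: "\n".encode(name) for name in ENCODINGS}


def lf_to_crlf(data: bytes) -> bytes:
    """Convert Unix line endings to CR/LF without doubling existing CR/LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


if __name__ == "__main__":
    print(newline_bytes())
    sample = "Γεια\nσου\n".encode("utf-8")
    print(lf_to_crlf(sample))
```

Working on the raw bytes is safe here because the byte 0x0A never occurs inside a multi-byte UTF-8 sequence, so a file converted this way should open as separate lines in Notepad.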

But at least the Notepad replacement works.

And Word?

Forget it. It's a dinosaur. Sure, it recognizes the Unix newline (Frontpage also understands it), but Word makes you work with your HTML as an RTE (rich-text editor). Besides that, it mangles your page beyond recognition, so badly in fact that I've used it at times to disguise the identity of the author (me).

My guess is Frontpage is either incorrectly analysing the source HTML - which it should not be doing anyway - or, even worse, it may just be broken, from lack of attention.

Unicode was first "supported" in Windows NT, back in 1997. Windows 98 never had a Unicode editor. (W98 "Word Pad" did UTF-16 only, while Notepad on NT was utf-8. Figure that one out...)

Windows XP is better.

XP Notepad currently supports UTF-8, but still puts the BOM (Byte-Order-Mark) at the very start of the text file (three bytes in UTF-8). It is harmless, but it was only ever necessary in UTF-16, to signal the big/little endian byte order of the UTF-16 integers to the recipient. But UTF-16 was never real Unicode anyway.
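For anyone who wants to check a file for the mark, here is a small Python sketch. The UTF-8 BOM is the three bytes EF BB BF; the helper names are invented for the example, and `utf-8-sig` is Python's codec that writes the mark on encoding.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """True if the data starts with the UTF-8 byte-order mark."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the data without a leading UTF-8 BOM, if one is present."""
    return data[len(codecs.BOM_UTF8):] if has_utf8_bom(data) else data


if __name__ == "__main__":
    # "utf-8-sig" prepends the BOM on encode, which mimics a Notepad save.
    notepad_style = "αβγ".encode("utf-8-sig")
    print(codecs.BOM_UTF8, has_utf8_bom(notepad_style))
    print(strip_utf8_bom(notepad_style).decode("utf-8"))
```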

I guess you could call UTF-16 "Microsoft Unicode".

Nobody is perfect

Firefox has a bug too, but not in the Unicode or HTML. It has to do with refreshing the page. It forgets where you were, and always restarts at the top of the page. This behavior essentially makes Firefox useless for web development.

For example, using Firefox visit:

Modality

Scroll down and press Refresh. You'll see what I mean.

Curiously enough - because they should be sharing the same code - Mozilla works fine, but it too is broken when you refresh a # (pound-sign) URL. For example, using Mozilla visit:

Causitive Verbs

Now, scroll down, then press Refresh. Same bug! Mozilla forgets where you were and puts you back at the start of the #causitive section.

Internet Explorer works correctly in both cases.

PPS

Just Some History

IBM stuck to EBCDIC until its dying day. [Well, it's not actually dead. The US Government still uses IBM mainframes to print millions of pay checks, twice a month.] You won't find EBCDIC on your list of browser encodings though, except maybe in Mozilla. [Well, who would be crazy enough to develop a web page in EBCDIC anyway?]

For all you character set fanatics out there, both ASCII and EBCDIC map the decimal digits to the BCD (Binary Coded Decimal) range: the low four bits of each digit character hold the digit's value. On an S/370 you could actually do arithmetic using the EBCDIC characters without converting them to numbers first. It had a BCD ALU (Arithmetic Logic Unit). That was a common way to add numbers in COBOL, which always dealt with money, decimal digits and fixed decimal point. (The S/370 also had a FP-ALU. Fortran needed it. And Fortran sent Apollo to the moon.)
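The digit layout is easy to see in Python. This sketch assumes Python's `cp037` codec as a stand-in for EBCDIC (one common EBCDIC code page); the function name is made up for the example.

```python
def low_nibble_digits(text: str, encoding: str) -> list:
    """Encode decimal digits and keep only the low four bits of each byte."""
    return [byte & 0x0F for byte in text.encode(encoding)]


if __name__ == "__main__":
    digits = "0123456789"
    # ASCII digits are 0x30..0x39; EBCDIC (code page 037) digits are 0xF0..0xF9.
    print(digits.encode("ascii").hex(), low_nibble_digits(digits, "ascii"))
    print(digits.encode("cp037").hex(), low_nibble_digits(digits, "cp037"))
```

In both encodings, masking off the high four bits leaves the numeric value of the digit.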

By comparison, the Intel processors have always had packed BCD addition (Motorola never did), but you have to pack the digits first, add them, then apply an ASCII adjust afterwards.

Why would anyone want to do this?

Because money is always represented as a fixed (not floating) point number. Yes, integer addition would work, but since the data is captured as (ASCII or EBCDIC) characters, you'd have to convert them to integers first, then back to characters. So why not just add the characters?
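Here is roughly what "just adding the characters" looks like, sketched in Python rather than in S/370 or Intel machine instructions. The function name is hypothetical, and it handles only non-negative amounts written as ASCII digits with the decimal point implied (cents, for example).

```python
def add_digit_strings(a: str, b: str) -> str:
    """Add two non-negative decimal strings digit by digit, right to left."""
    width = max(len(a), len(b))
    a, b = a.rjust(width, "0"), b.rjust(width, "0")
    result = []
    carry = 0
    for ch_a, ch_b in zip(reversed(a), reversed(b)):
        # The low nibble of an ASCII digit is its numeric value.
        total = (ord(ch_a) & 0x0F) + (ord(ch_b) & 0x0F) + carry
        carry, digit = divmod(total, 10)
        result.append(chr(0x30 | digit))  # back to an ASCII digit
    if carry:
        result.append("1")
    return "".join(reversed(result))


if __name__ == "__main__":
    # Amounts in cents, so the decimal point position stays fixed.
    print(add_digit_strings("1999", "0001"))
```

The amounts are never converted to binary integers as a whole; each step works on one digit character and a carry, which is the job a decimal-adjust instruction does in hardware.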
 


spiros

  • Administrator
  • Hero Member
  • *****
    • Posts: 815679
    • Gender:Male
  • point d’amour
Quote
There is no such thing as a "unicode" font.

Talk less, say more (especially when you are dealing with practical problems).

Yes, there is such a thing as a Unicode font: use Arial Unicode MS (if you check your system, you will see it is a 20 MB font, in order to fit all scripts!).

If you use no CSS I suggest you start using it. Otherwise you will have to repeat style information all over your HTML. Using CSS is much tidier. Alternatively you will have to do something like this:

Code: [Select]
<font face="Arial Unicode MS">Your text</font>
Quote
Arial Unicode MS
From Wikipedia, the free encyclopedia

In digital typography, Arial Unicode MS is an OpenType font based on the Arial font. It includes 51180 glyphs and covers a large subset of Unicode 2.1, including support for most previous Microsoft code pages. The font was designed by Agfa Monotype under contract to Microsoft.

Arial Unicode MS was previously available for download, but now is only distributed along with Microsoft Office.

Other well-known fonts with Unicode coverage include Cyberbit, Code2000, Doulos SIL, Lucida Sans Unicode, and the Free software Unicode fonts.
https://en.wikipedia.org/wiki/Arial_Unicode_MS
« Last Edit: 27 Feb, 2006, 22:32:57 by spiros »



joe

Good Info Spiros, Thanks.

It looks like Arial Unicode is just a fat font. It says you shouldn't use it as a default font for Word. Hm... Over 58,000 glyphs is not very practical.

And it says it was developed by another company. Interesting. It also confirms my suspicion that all fonts are unicode fonts, just the level of support changes.

And I'll bet Arial Unicode does the ISO and Windows encodings too, which means calling it a "unicode" font is a bit misleading, don't you think? Maybe they should call it "Universal or Uniform Arial". That's the kind of hair-splitting the Internet gurus over at W3C do. Is a URL a "uniform" or a "universal" resource locator? (It's a "uniform" resource locator.)

Same thing for "Greek" fonts. Polygreek used to be rather rare, so when I saw it was part of M-TNR, I got all excited. Only the browsers on Windows, both IE and Firefox, are not getting to it: IE not at all, and Firefox is using font substitution. I wonder how Mozilla works on Windows.

I've used CSS. It's not bad, but it has nothing to do with font support. It won't help either browser find the M-TNR glyphs. It just selects the font face. And Arial is not the one I want. Sorry.


joe

I just discovered that Windows computers come with two polytonic Greek fonts:

  • Arial Unicode MS - Sans Serif
  • Palatino Linotype - Serif

This will let me build a Lexicon of Ancient Greek Verbs. It also gives me the opportunity to use CSS. You can specify the typeface like this:

Code: [Select]
BODY  {font-family: 'Palatino Linotype',Palatino,'ClasGaramond',Garamond,'Times New Roman',Times,'CG Times',serif;}

The other thing that's essential is to specify the character set as UTF-8, like this:

Code: [Select]
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
With these two parameters, anyone should be able to visit the Lexicon of Ancient Greek Verbs without configuring the browser. The font is fully scalable.
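To see why the charset declaration matters, here is a small Python sketch of what happens when UTF-8 bytes are read under the wrong assumption. The choice of `latin-1` as the wrongly assumed charset is only an assumption for the example (what a browser actually falls back to depends on its locale and settings), and the function name is made up.

```python
def misread(text: str, assumed: str = "latin-1") -> str:
    """Encode text as UTF-8, then decode it with a different, assumed charset."""
    return text.encode("utf-8").decode(assumed)


if __name__ == "__main__":
    greek = "Καλώς"
    # Garbled: each Greek letter is two UTF-8 bytes, read as two characters.
    print(misread(greek))
    # Intact: the bytes are decoded with the charset they were written in.
    print(greek.encode("utf-8").decode("utf-8"))
```

Plain ASCII text survives the mistake, which is why a page can look fine until the first Greek word appears.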

Joe


