LLVM For IRIX
#11
RE: LLVM For IRIX
I have a PayPal verified account. While I'm not particularly a fan of the company (or any big tech company, for that matter), it's plenty convenient for buying things online, as opposed to typing one's credit card info into everyone and their uncle's web checkout...

Project: Temporarily lost at sea
Plan: World domination! Or something...
vishnu
Tezro, Octane2, 2 x Onyx4

08-13-2021, 06:59 PM
#12
RE: LLVM For IRIX
Really, PayPal becomes an issue for people receiving money all the time... Buyers don't have any issue because most don't keep ANY PayPal balance, so who cares if the account is frozen. Sellers often keep some money in, and that money can be stolen by PayPal under some circumstances. In essence, you may not really be paid for items you've already sent.

I too use a verified account, with a bank link. But I don't sell enough stuff to be a target.
weblacky
I play an SGI Doctor, on daytime TV.

08-13-2021, 09:41 PM
#13
RE: LLVM For IRIX
Hi again everyone. After the excitement of getting a full LLVM cross compilation toolchain last week, the last thing I needed to complete it (other than runtime stuff like compiler-rt, and the LLDB debugger) was a C++ standard library, so I set my sights on getting a libc++ port up and running.

Most of this work was relatively boring. The first issue was that IRIX's libc attempts to typedef wchar_t as a long, which conflicts with clang++. Luckily, SGI implemented a macro to disable this for their own C++ compiler, _WCHAR_T, so I #define-d it and the error went away. IRIX's libc was also missing max_align_t, so I added a struct for it into the IRIX support directory I placed into libc++. Finally, IRIX's libc is missing basically all modern POSIX locale functions and definitions. libc++ requires a locale implementation, so I fumbled around with this for a bit until I realized that the libc++ contributors had anticipated ports to POSIX platforms without a working locale implementation, and had written locale stubs. I simply created an xlocale.h implementation that referenced these locale stubs, which left me with only 2 errors: IRIX doesn't have a wcsnrtombs or mbsnrtowcs implementation, so I made some stubs that returned null. With that, it was on to the linking stage for libc++.

Of course, my C++ hello world didn't link. libc++ gets built for the target platform that the compiler is going to run on as part of building LLVM, and I was building clang/LLVM/libc++ to run on Darwin. It was time to build a native LLVM for IRIX! I downloaded the SGUG libstdc++ from the Octane and set out to build LLVM using my cross compiler. I ran into issues right away. It's actually surprising how out of date most of the documentation is with regard to cross building modern LLVM toolchains: no matter what flags I passed in from the LLVM cross build docs, it kept trying to build for my host machine and attempting to pull in my host libraries. It took a long while of searching around the web before I finally realized that I should be passing a CMake toolchain file into the cmake stage. It was finally off to the races!

I ran into some issues again that I'm still not sure about, since there weren't any obvious error messages or anything being dropped into the logs. I didn't investigate further, however, because I ran into another showstopping issue: LLD was emitting IRIX-incompatible shared libraries. I only noticed this because I realized that LLVM was going to need libxg to build for IRIX, since I had used it to provide a few standard C/POSIX functions that IRIX doesn't have when I was trying to compile LLVM on IRIX itself. I went to build libxg and ran into one issue: LLD was trying to use crtbeginS.o and crtendS.o from the GCC multilib directory. GCC on IRIX, however, doesn't use these object files. I figured I could simply symlink crtbeginS.o and crtendS.o to the regular crtbegin.o and crtend.o respectively, but just to make sure, I checked the shared library linking behavior of Binutils LD on IRIX, and as I suspected, it just used the normal crtbegin.o and crtend.o. I symlinked them and got a shared library out of LLD, but when I went to test it on IRIX, rld completely rejected it:

   

Here's what the SGI ELF64 manual had to say about it:
   

Okay, fine then, what if we try forcing the shared library base to 0x10000 by adding -Wl,-image-base=0x10000 to the LDFLAGS? This should make sure that every section inside the shared library is loaded into the application's memory at 0x10000 or above (although this might get moved if another shared library is trying to load into the same area of memory).

(I don't have a screenshot of it, but IRIX returned a segmentation violation when I tried loading a shared library after this "fix")

Okay, what is even going on? 

Sidenote: In hindsight this should have been a very easy clue. Shame on me, I suppose, for not knowing as much about how ELF binaries are loaded into memory before this week.

I decided to forcibly set the DT_MIPS_BASE_ADDRESS inside LLD to 0x10000. This time, I got this upon running a test program with the shared library:

   

It actually ran, but rld was emitting warnings. They were pretty annoying, and I was worried that what I was doing could result in undefined behavior, so I set to work on figuring out how to set the DT_MIPS_BASE_ADDRESS in a way that rld was happy with. To narrow down what LLD was doing differently, I generated 3 variants of a test shared library: one using clang and Binutils LD, one using the -Wl,-image-base=0x10000 flag, and one using my forced DT_MIPS_BASE_ADDRESS trick. This unfortunately led me to chase a red herring. Upon comparing the program headers of the 3 shared libraries under readelf, I came to the incorrect conclusion that the segmentation violation was caused by the executable segment of the image base flag shared library being set to 0x20440. I assumed this because in the shared library generated with my DT_MIPS_BASE_ADDRESS trick the executable segment was set to 0x10440, and in the Binutils LD shared library it was set to 0x10000. I forcibly set the executable sections to generate at an offset below 0x20000 in memory, but this still caused a segmentation violation.

I really wasn't sure where to go from here. I tried passing the default Binutils LD linker script to LLD, which didn't work at all: LLD errored out and complained about some of the symbols being placed over one another. I was about to throw in the towel and accept the warnings for now when I came to a realization I should have had a long time before: I should create a shared library using MIPSPro and compare it to the result of clang + Binutils LD. Once I did this, I noticed something immediately:

   

   
(Top is Binutils LD, bottom is MIPSPro)

In both of them, not only was the executable PT_LOAD segment placed right at the base address with an offset of 0, but it was also the very first PT_LOAD segment. I quickly found this reference diagram from the Linux Foundation and everything clicked into place:

   

As far as I can tell, this is what was happening before: rld expects the PT_LOAD segment that's placed at or near 0x10000 to be executable. In the case of the DT_MIPS_BASE_ADDRESS trick, this managed to work since the only segment near 0x10000 was the executable segment: it was placed at 0x10440, and every other segment was placed above or below 0x10000. The ELF headers were supposedly placed below 0x10000, but I suspect rld was shifting them up to 0x10000 using the DT_MIPS_BASE_ADDRESS, which could also explain the warning rld was emitting about a movement calculation error. I wasn't so lucky with the image base flag version of the shared library. There, rld must have been mmap-ing the first PT_LOAD segment, which LLD had placed at 0x10000, into memory and then attempting to jump into it, causing IRIX to throw a segmentation violation since the segment wasn't marked as executable.
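The layout comparison described above can also be scripted rather than eyeballed in readelf. What follows is a minimal Python sketch, not anything taken from the LLD changes: it reads an ELF image's program headers with the standard struct module and reports whether the first PT_LOAD segment is executable, sits at file offset 0, and starts at the expected base. The function names and the 0x10000 default are assumptions made purely for illustration.

```python
import struct

PT_LOAD = 1   # loadable segment
PF_X = 0x1    # execute permission bit in p_flags


def load_segments(image: bytes):
    """Return (vaddr, file_offset, flags) for each PT_LOAD, in header order."""
    if image[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    is64 = image[4] == 2                  # EI_CLASS: 1 = ELF32, 2 = ELF64
    end = ">" if image[5] == 2 else "<"   # EI_DATA: 2 = big-endian
    if is64:
        (phoff,) = struct.unpack_from(end + "Q", image, 0x20)
        phentsize, phnum = struct.unpack_from(end + "HH", image, 0x36)
    else:
        (phoff,) = struct.unpack_from(end + "I", image, 0x1C)
        phentsize, phnum = struct.unpack_from(end + "HH", image, 0x2A)

    segments = []
    for i in range(phnum):
        at = phoff + i * phentsize
        if is64:
            # Elf64_Phdr starts with p_type, p_flags, p_offset, p_vaddr
            p_type, p_flags, p_offset, p_vaddr = struct.unpack_from(
                end + "IIQQ", image, at)
        else:
            # Elf32_Phdr starts with p_type, p_offset, p_vaddr; p_flags is at +24
            p_type, p_offset, p_vaddr = struct.unpack_from(end + "III", image, at)
            (p_flags,) = struct.unpack_from(end + "I", image, at + 24)
        if p_type == PT_LOAD:
            segments.append((p_vaddr, p_offset, p_flags))
    return segments


def first_load_looks_like_irix(image: bytes, base: int = 0x10000) -> bool:
    """True if the first PT_LOAD is executable, at file offset 0, and at `base`."""
    segments = load_segments(image)
    if not segments:
        return False
    vaddr, offset, flags = segments[0]
    return bool(flags & PF_X) and offset == 0 and vaddr == base
```

Reading each test library with open(path, "rb").read() and passing the bytes to first_load_looks_like_irix would give a quick yes/no per variant; whether rld really checks exactly these three properties is only the guess from the paragraph above, not something this script proves.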

It took me a day or 2 of experimenting with LLD, but I finally figured out a way to force LLD to flag the first PT_LOAD segment as executable and place it at 0x10000. In the end, after all of my poking and prodding, only a few lines of code needed to be changed inside LLD, but hey, it worked. rld wasn't throwing any errors or warnings, and there weren't any segmentation violations. I'll likely have to improve the method I'm using though: while it works, I'm not sure if it'll cause problems on other platforms that use ELF. I might have to come up with a flag to signal that it's building for IRIX. That's something I'll worry about later on though. I suppose now I can go back to figuring out how to build a native LLVM for IRIX.

If any of you guys are interested in supporting me financially, I opened up a Patreon. Any amount is appreciated. Not only does it help me out, but it means that there are people out there who genuinely care about what I'm doing, which would mean the world to me. As always, the code is up on GitHub.
(This post was last modified: 08-21-2021, 10:44 PM by aurxenon. Edit Reason: Clarification )
aurxenon
O2

08-21-2021, 10:37 PM
#14
RE: LLVM For IRIX
Awesome. I will be donating to Aurxenon's Patreon myself in the future, just to ensure that he gets exactly what he needs to continue doing these great things. He's getting a lot of gear from me for helping with all this.

I'm the system admin of this site. Private security technician, licensed locksmith, hack of a c developer and vintage computer enthusiast. 

https://contrib.irixnet.org/raion/ -- contributions and pieces that I'm working on currently. 

https://codeberg.org/SolusRaion -- Code repos I control

Technical problems should be sent my way.
Raion
Chief IRIX Officer

08-21-2021, 10:46 PM
#15
RE: LLVM For IRIX
Hey people, sorry for not updating for so long. I didn't die, it just turned out that C++ support ended up being far more complicated than I initially expected it to be (mostly because of my own stupidity), and uni also started back up, so I haven't had as much time to work on this as I would have liked. This post isn't going to be quite as comprehensive about the things I did as my previous posts have been. A lot has happened in the past few weeks, including this project getting featured on the BSD Now podcast, me getting into contact with Eschaton (the first person to attempt a port of LLVM for IRIX), and best of all, a new contributor joining me! Vladimir Vukićević is probably the first person I've talked to more than once who has a Wikipedia article written about them, and he's been absolutely invaluable. He's taken the initiative on reexamining and fixing assumptions and hacks I've made both now and in the past, as well as extending LLVM's support for IRIX by tracking down bugs caused by clang/LLD interactions with rld. He's the first to actually get clang and LLD to run on IRIX, and I'm genuinely so grateful for all of his help. A small community is beginning to form around this project, and I'm happy knowing that even if I were to quit for some reason (I'm not planning on it), the project would continue progressing without my involvement.

Starting from the beginning: after the last update, I took a break from LLVM for a few days and worked on other stuff, so when I came back, I immediately dove headfirst into C++ support. I picked up where I had left off before the shared libraries bug, with getting libc++ and libc++abi up and running. I figured out what was causing that bizarre cmake failure from last time: cmake was missing an IRIX platform file, as well as IRIX definitions inside the libc++ CMakeLists.txt. I added the platform file from the SGUG cmake port, and I added some IRIX flags to the CMakeLists.txt.

I was onto the build process, which at this point mainly consisted of me fighting with both the libc++ headers and GCC's include-fixed headers. The main sources of frustration were anything involving va_list and wchar_t. clang and GCC both define wchar_t themselves, but the IRIX headers have a typedef-d wchar_t for MIPSPro, causing a type redefinition error on both clang and GCC. The IRIX headers define va_list as a char*, which GCC doesn't expect, so the GCC include-fixed headers provide alternate functions that use its own builtin __gnuc_va_list type. clang doesn't know about this, however, so it tries to use the normal IRIX va_list, inevitably causing an error because it can't find functions using the IRIX va_list. There wasn't really any easy way to patch this within the libc++ support files, so I ended up simply hacking away at the include-fixed headers so that the IRIX va_list functions would be available when building with clang.

There were some other bugs in libc++ that I had to fix by defining preprocessor macros and turning off certain functionality. Unfortunately my memory's pretty fuzzy on them, since it's been nearly 3 weeks since I was working on them. I also extended libxg to provide strerror_r. Around this time I built libc++abi and got a whole lot of errors related to relocations on read only segments:

   

I disabled the errors by passing -Wl,-z,notext and they went away, but unfortunately this would still end up coming into play later on.

I got libc++ and libc++abi to both build, and I linked a C++ hello world, but rld surprised me with this:

   

iswblank was a relatively easy fix. It turns out that IRIX exports iswblank in its C library as _iswblank, so I simply defined an iswblank macro for libc++ that pointed at _iswblank. The other 2 errors were more involved. I didn't know what they were, so I demangled the function names and realized they were the aligned operator new and aligned operator delete. I had to create a posix_memalign in libxg for libc++ so that I could use libc++'s aligned memory allocation function, and thus enable the missing operator overloads that were causing this error.

Finally I had resolved all the missing functions, and I went to run my hello world program, only to hit this:

   

I ran the program in GDB (without any debugging symbols) and realized it was segfaulting when it was calling a constructor. I peppered some printf statements around and confirmed that it was crashing right here:

   

Now here's the rather dumb part. I ended up messing around for a little over a week in my off hours trying to get the LLD output to more closely match Binutils LD in terms of the ELF segment layout. I assumed the layout was to blame because, for one thing, I was still quite leery of the hack I had done to get LLD to output an executable text segment at the DT_MIPS_BASE_ADDRESS. It resulted in a very suspicious looking "ghost" executable segment that, while it technically worked, was definitely one of those things I cannot explain. See for yourself:

   

On top of this ghost segment issue, I was also leery of the relocation errors I had been getting earlier. Binutils LD only emits two PT_LOAD segments, one for executable code and read only data, and one read/write segment for everything else. The eh_frame section is inside that read/write segment, so I assumed the difference in LLD's output was potentially triggering a bug in rld, causing it to improperly relocate runtime data.

I'm not gonna bore you guys with the details, because, well, it was really boring. I managed to get the LLD output to more or less mirror the Binutils LD output, but it still didn't work. It was still crashing in the exact same place. I paused the project for a few days since I was starting to feel quite drained.

I came back, though, refreshed and ready to get to work, and decided this time around: screw libc++, let's see what happens if we try GNU libstdc++. I created 4 test binaries using the combinations of g++, clang++, Binutils LD, and LLD. Out of the 4, only the combination of g++ and Binutils LD produced a working binary that didn't just immediately segfault. (Later on I got a test binary from MIPSPro, but it ended up being irrelevant.) I decided to step through all of the nonworking binaries in GDB, and that's when I finally noticed something that I'd completely missed the first time around: the programs were not only all segfaulting at the same place, they were segfaulting when they tried to run operations using member variables of ostream objects.

   

This didn't make any sense though. The objects themselves were all allocated in memory, so why were they all full of invalid member variables? I then noticed that, wait a second, the objects aren't on the stack like a normal local variable, they're all off in memory occupied by the shared library, which means they're all global variables.

How is a global variable initialized? Good question. In C, the compiler is responsible for assigning values to global variables: at compile time, it allocates space in the .data section for the variable and writes the constant value into it, and the linker simply adds this data section to the final binary. C++ is somewhat different. The compiler performs the same operation for any constant initializer, such as int x = 5; but for a non-constant initializer, such as int* x = new int[5]; the variable is initialized at program startup, before the main function is even called. The specific way the runtime initializes variables can differ between compilers and platforms. On most modern platforms, GCC's runtime walks the entries in the .init_array section and calls each global constructor placed there to initialize any global variables. On other operating systems such as IRIX, GCC's runtime instead walks backwards through the .ctors section to call the global constructors. Regardless of the method, the global variables should always be initialized by the global constructors before the main function is entered, so that they're immediately available for use with valid contents.
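To make the two conventions a bit more concrete, here is a toy Python model (an illustration only, not GCC's actual runtime code) of a startup routine walking a table of constructor pointers before main runs. Every name in it is made up for the example. The point is simply that the .init_array style table is walked front to back, the .ctors style table is walked back to front, and if nothing walks the table at all, main sees a global that was never set up.

```python
def run_init_array(table):
    """Model of the .init_array convention: call entries first to last."""
    for ctor in table:
        ctor()


def run_ctors(table):
    """Model of the older .ctors convention: call entries last to first."""
    for ctor in reversed(table):
        ctor()


class FakeGlobal:
    """Stands in for a C++ global object with a non-constant initializer."""

    def __init__(self):
        self.ready = False   # the state main() sees if no constructor ran

    def construct(self):
        self.ready = True


def start_program(table, runner, main):
    """Model of process startup: run global constructors (if any), then main()."""
    if runner is not None:
        runner(table)
    return main()
```

With g = FakeGlobal(), calling start_program([g.construct], None, lambda: g.ready) models the broken binaries: main runs, but the "global" was never constructed. Passing run_ctors or run_init_array as the runner models a startup path that actually walks the table.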

I didn't make the connection before, but std::cout is one of those global objects, and just like other global objects, it needs to be initialized by the runtime before it can be used, which explains why it was causing a segfault every time I tried using it without it being initialized. Just to double check that my theory about the global constructors not being executed was correct, I made a test program that simply initialized a global object of a custom class, then called member functions on it to read its contents. As I suspected, none of the values that were returned had been initialized:

   

This should have returned 5 and Hello! for X and Y respectively.

Around this time was when Vladimir Vukićević joined me, and he made rapid progress on cleaning up the various fixes and hacks I'd applied, making it easier for newcomers to get started on setting up an IRIX clang/LLD cross compiler, natively hosting clang/LLD on IRIX, and debugging/fixing his own issues he ran into while cross compiling software with clang/LLD. 

It was pretty easy to get a working binary out of clang and Binutils LD. I passed -fno-use-init-array to clang so that it would revert to the older .ctors method of generating a global constructors list, then passed the object file through to Binutils LD, resulting in a working hello world. clang with LLD proved to be trickier. Even with the command line flag, it still wasn't calling any of the global constructors, even though they were now being stored in the .ctors section instead of .init_array. I dumped a list of the symbols present in the clang/Binutils LD binary and a list of the symbols present in the clang/LLD binary, and I realized that the clang/LLD binary was missing a lot of the symbols necessary to actually call the global constructors.

This was pretty baffling to me. At first I assumed that maybe these symbols just weren't being read in from the crt objects GCC provides, but I was able to confirm that LLD was seeing them by placing a printf at the function where LLD loads in object file symbols, and it directly printed out the list of symbols that were missing from the final binary. I then assumed that maybe LLD was somehow deciding they were unnecessary and discarding them. This didn't make a whole lot of sense though, so I opened up the crtn.o file that contained these symbols and realized something right away.

Do you remember this screenshot?

   

It was from one of my earlier posts. It turns out that the local symbols, which were all related to global object initialization, were mixed in with the global symbols. I figured that, when preserving symbols for the final binary, LLD was only including the ones positioned before the global and weak symbols appeared. I searched through the LLD code and found my suspicions confirmed.
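For anyone who wants to check an object file for the same problem, the idea can be sketched in a few lines of Python. The ELF spec expects every STB_LOCAL symbol to come before the global and weak ones in a symbol table, so a reader that splits the table at the first non-local entry will never treat later locals as locals. This is only a toy over (name, binding) pairs with made-up symbol names, not LLD's code, and a real reorder would also have to fix up the symbol indices used by relocations, which this ignores.

```python
STB_LOCAL, STB_GLOBAL, STB_WEAK = 0, 1, 2   # ELF symbol binding values


def first_non_local(symbols):
    """Index of the first symbol whose binding is not STB_LOCAL."""
    for index, (_name, binding) in enumerate(symbols):
        if binding != STB_LOCAL:
            return index
    return len(symbols)


def stranded_locals(symbols):
    """Names of local symbols that sit after the first global or weak symbol.

    A reader that only treats entries before that point as locals would
    never pick these up.
    """
    cut = first_non_local(symbols)
    return [name for name, binding in symbols[cut:] if binding == STB_LOCAL]


def reorder(symbols):
    """Stable reorder that moves every local ahead of the non-locals."""
    return sorted(symbols, key=lambda sym: sym[1] != STB_LOCAL)
```

Feeding it a table such as [("", 0), ("hypothetical_local_a", 0), ("hypothetical_global", 1), ("hypothetical_local_b", 0)] reports hypothetical_local_b as stranded, and the list comes back empty after reorder.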

   

   

Continued in part 2 since I've hit the file limit for this post
aurxenon
O2

09-14-2021, 10:45 PM
#16
RE: LLVM For IRIX
!!!!!!CORRECTION!!!!!!
When I said crtn.o in the last post, I actually meant crti.o in all of the instances I referred to it, although crtn.o also had similar issues

Perhaps I was wrong when I said I was going to be more concise this time around?

I began writing a kludge to simply force getLocalSymbols to return all symbols in the object and let copyLocalSymbols() do the work of determining which symbols were actually necessary to operate on, but then vladv informed me that llvm-objcopy would do most of the work of reordering the symbols correctly inside the object file. I ran it on crti.o and crtn.o, and hey, whadayya know, it worked perfectly. I didn't even have to pass any flags. This STILL wasn't the end of the global constructors saga though. I was still getting a segfault, so I ended up spending some time researching online how the global constructors are actually called by the runtime, until I eventually found this comment in a GNU irix-crti.s file on Google:

   

Following the advice of this comment (bless you, beautiful past people who left this knowledge for future porters), besides my previous -fno-use-init-array, I added -Wl,-init=__gcc_init and -Wl,-fini=__gcc_fini to the clang command line, and added some extra code to 2 places in LLD to ensure that __gcc_init and __gcc_fini aren't tampered with. The next time I ran the clang/LLD output, I hit an illegal instruction error rather than a plain old segfault. I yet again stepped through the binary in GDB, and this time I found that the binary was calling the __gcc_init function and then... more or less just falling into the abyss. It ran past the end of the function and kept executing whatever came next in memory as if it were instructions.

I dumped a copy of the __gcc_init symbol from clang/Binutils LD's binary and a copy from clang/LLD's binary, and found that __gcc_init was missing some return instructions in the clang/LLD version:

   
clang/LLD version

   
clang/Binutils LD version

I puzzled over this for a bit, until I decided to dump the __gcc_init symbol from all 4 C runtime object files that LLD and Binutils LD should pull in. The contributions from crtbegin.o, crti.o, and crtend.o all appeared in both versions of the binary. The one exception was crtn.o: that object file contained the function ending/return statements, and this was what was missing from the clang/LLD version. I ran clang/LLD in verbose mode and discovered yet another IRIX quirk. Most platforms only have 1 crtn.o, but IRIX has 2 if you're using libgcc as your runtime: one from IRIX in /usr/libXY/mipsZ/crtn.o, and another in /usr/lib/gcc/mips-sgi-irix6.5/$GCC_VERSION/crtn.o

When clang was constructing the LLD flags, it didn't know that IRIX was a bit special, so since /usr/libXY/mipsZ was higher in the path hierarchy, it pulled in only the IRIX crtn.o rather than both the IRIX crtn.o and libgcc's crtn.o. I mentioned it to vladv and he quickly wrote up a patch for this, as well as another patch to make sure the correct init and fini flags are automatically applied.
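The difference between the two lookups is easy to show in miniature. This Python sketch is not vladv's patch or clang's driver code, just a toy with hypothetical function and directory names: a plain search-path lookup stops at the first directory that has the file, while the IRIX case needs every crtn.o that exists along the path. The order in which the two objects belong on the link line is not something this sketch tries to settle.

```python
import os


def first_match(search_dirs, filename, exists=os.path.isfile):
    """Return only the first hit, like an ordinary search-path lookup."""
    for directory in search_dirs:
        candidate = os.path.join(directory, filename)
        if exists(candidate):
            return [candidate]
    return []


def all_matches(search_dirs, filename, exists=os.path.isfile):
    """Return every hit in search order, so no runtime's object is dropped."""
    return [
        os.path.join(directory, filename)
        for directory in search_dirs
        if exists(os.path.join(directory, filename))
    ]
```

The exists parameter defaults to a real filesystem check, but a set's membership test can be passed in to try the behavior without touching the disk.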

Finally, after all of this work, guess what, we got a working Hello world! from clang/LLD using libstdc++!

To be clear, the global object saga still isn't totally over. There are still bugs, and vladv's been hard at work fixing a relocation bug related to global statics in particular. But we've progressed enough now that we can get simple C++ programs to work, and I felt we'd reached a point where I could give a tangible update on our progress.

Stepping away from the global objects bugs for now though, vladv is the first person in the world to run clang/LLD, targeting IRIX, on real SGI hardware under IRIX!

The reason I mostly talked about my own changes in this post is that these posts are mostly about what my thought process was at the time I was making each change, and well, I'm not vladv, so I can't talk about what his thought process was at the time of each change.

I'm extremely happy with all the work he's done. It's been awesome having another person who's just as motivated as I am (possibly even more) to see this port through, and he's incredibly sharp.

Now that vladv has joined the project, the GitHub situation has sort of changed. I try to keep our 2 repositories in sync, but vladv commits way more often than I do, and he lives in a different timezone, so they might fall out of sync. Thus, just in case:

My Repository Link

vladv's Repository Link

If you'd like to support me financially, here's my patreon.
aurxenon
O2

09-14-2021, 11:52 PM
#17
RE: LLVM For IRIX
I finally had some time to read up on all the work you did, absolutely amazing work so far! I was also sad to see that no one has backed your Patreon yet, so I made a start.

O2 O2 O2 Octane2 1600SW 1600SW 1600SW Presenter
karpour
O2

11-23-2021, 12:19 AM
#18
RE: LLVM For IRIX
I plan to as well. His focus has changed but he should be backed for his hard work.

Raion
Chief IRIX Officer

11-23-2021, 01:15 AM

