The International Simutrans Forum

 

Author Topic: Experimental on Linux  (Read 11388 times)


gyom

  • Guest
Experimental on Linux
« on: April 16, 2010, 08:10:57 PM »
Hi,

I tried to run Simutrans Experimental 7.3 on Linux (Ubuntu) but I keep getting a segfault.

gdb output:
Program received signal SIGSEGV, Segmentation fault.
0x0000000000613063 in display_ddd_box_clip(short, short, short, short, unsigned short, unsigned short) ()

I tried the i386 and amd64 versions, tried deleting the settings.xml, nothing works.

I use the build from http://www.43-1.org/~simutrans/simutrans-exp/i386/

Anyone managed to run STE on Linux ?

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 18745
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Experimental on Linux
« Reply #1 on: April 16, 2010, 08:45:46 PM »
Gyom,

thank you for your report. Judging by your memory address, you are on a 64-bit system, yes? In that case, you should use the 64-bit version. I have not seen any crashes in that method before. Indeed, that is one that is not unique to Simutrans-Experimental (that part of the program is exactly the same as Standard). Have you tried running Standard on Linux?

Offline sdog

  • Devotee
  • *
  • Posts: 2039
Re: Experimental on Linux
« Reply #2 on: April 17, 2010, 12:13:47 AM »
on my ubuntu 64-bit system the 32-bit version runs, unlike the 64-bit version.

Offline neroden

  • Devotees (Inactive)
  • *
  • Posts: 831
  • Nathanael Nerode
Re: Experimental on Linux
« Reply #3 on: April 17, 2010, 12:58:33 AM »
Quote from: gyom
Hi,

I tried to run Simutrans Experimental 7.3 on Linux (Ubuntu) but I keep getting a segfault.

gdb output:
Program received signal SIGSEGV, Segmentation fault.
0x0000000000613063 in display_ddd_box_clip(short, short, short, short, unsigned short, unsigned short) ()

I tried the i386 and amd64 versions, tried deleting the settings.xml, nothing works.

I use the build from http://www.43-1.org/~simutrans/simutrans-exp/i386/

Anyone managed to run STE on Linux ?


I run it on Linux (Debian) all the time -- but I have a 32-bit system.

Looks like we may have a problem with 64-bit cleanliness in the code.  This is going to be tedious to find because there's lots of slightly sloppy integer type usage.  I'm trying to clean that up, but it will take a long time.

If you could give a full backtrace ('bt' in gdb) it might help.

Offline sdog

  • Devotee
  • *
  • Posts: 2039
Re: Experimental on Linux
« Reply #4 on: April 17, 2010, 02:19:16 AM »
Code:
Starting program: ~/simuexp/simutrans-exp-64-2010-04-11-fd32563
[Thread debugging using libthread_db enabled]
Reading low level config data ...
parse_simuconf() at config/simuconf.tab: Reading simuconf.tab successful!
Preparing display ...
Screen Flags: requested=10, actual=10
Loading font 'font/prop.fnt'
font/prop.fnt sucessfully loaded as old format prop font!
Init done.

Program received signal SIGSEGV, Segmentation fault.
0x0000000000613063 in display_ddd_box_clip(short, short, short, short, unsigned short, unsigned short) ()
(gdb) bt
#0  0x0000000000613063 in display_ddd_box_clip(short, short, short, short, unsigned short, unsigned short) ()
#1  0x000000000048e011 in button_t::zeichnen(koord) ()
#2  0x00000000004d15c7 in gui_container_t::zeichnen(koord) ()
#3  0x00000000004a18b3 in gui_scrollpane_t::zeichnen(koord) ()
#4  0x00000000004d15c7 in gui_container_t::zeichnen(koord) ()
#5  0x00000000004f9be3 in pakselector_t::zeichnen(koord, koord) ()
#6  0x000000000059ad39 in simu_main(int, char**) ()
#7  0x0000000000616664 in main ()

thanks for the howto in:
http://forum.simutrans.com/index.php?topic=4871.msg48080#msg48080

gyom

  • Guest
Re: Experimental on Linux
« Reply #5 on: April 17, 2010, 03:47:42 AM »
Thanks Sdog! After reading your post I downloaded the i386 version again and indeed it is running fine!
I must have mixed it up with the amd64 one while copying  :-X
My backtrace for the amd64 build is exactly the same as Sdog's.

Thank you all for your answers !

At last I can try the new version !  :)

Offline neroden

  • Devotees (Inactive)
  • *
  • Posts: 831
  • Nathanael Nerode
Re: Experimental on Linux
« Reply #6 on: April 17, 2010, 08:45:04 AM »
Code:
#0  0x0000000000613063 in display_ddd_box_clip(short, short, short, short, unsigned short, unsigned short) ()
#1  0x000000000048e011 in button_t::zeichnen(koord) ()
#2  0x00000000004d15c7 in gui_container_t::zeichnen(koord) ()
#3  0x00000000004a18b3 in gui_scrollpane_t::zeichnen(koord) ()
#4  0x00000000004d15c7 in gui_container_t::zeichnen(koord) ()
#5  0x00000000004f9be3 in pakselector_t::zeichnen(koord, koord) ()
#6  0x000000000059ad39 in simu_main(int, char**) ()
#7  0x0000000000616664 in main ()

Hmm, looks like these are compiled without debugging information.  :-(  A build with debugging information would give us *line numbers*.

It's not actually much slower, it just makes the files a bit larger.  To whoever does the automatic builds, could you perhaps build with DEBUG=1 to get us debug information in the official builds of experimental?  It seems to need a lot of debugging, so....

I'm guessing there's heavy inlining going on because there's nothing suspicious in the named routine, but there *is* suspicious stuff in the routines it calls: display_fb_internal and display_vl_internal.   The first thing to try is a recompile with -DUSE_C activated, because I bet that x86 assembly language code has embedded assumptions about the size of int.  If that fails, then the USE_C version probably has embedded assumptions too.

Standard is probably broken on amd64 too.
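
To make the "embedded assumptions about the size of int" worry concrete, here is a minimal sketch of a fill loop that behaves the same on i386 and amd64. It is not the real display_fb_internal: fill_pixels16 and its parameters are hypothetical names, and the only assumption is that the real routine writes 16-bit pixels in wider chunks. A fixed-width uint32_t always covers exactly two pixels, whereas long is 4 bytes on i386 Linux but 8 bytes on amd64 Linux.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper, not the real display_fb_internal: fill `count`
// 16-bit pixels starting at `dest` with `colour`.
void fill_pixels16(std::uint16_t *dest, std::size_t count, std::uint16_t colour)
{
    assert(dest != nullptr || count == 0);

    // Pack two pixels into one fixed-width 32-bit word. A `long` here
    // would be 4 bytes on i386 Linux but 8 bytes on amd64 Linux.
    const std::uint32_t pair =
        (static_cast<std::uint32_t>(colour) << 16) | colour;

    std::size_t i = 0;
    for (; i + 2 <= count; i += 2) {
        // memcpy avoids alignment and aliasing assumptions about dest.
        std::memcpy(dest + i, &pair, sizeof pair);
    }
    if (i < count) {
        dest[i] = colour; // odd pixel left over
    }
}
```

Because both halves of the 32-bit word hold the same colour, the result does not depend on byte order.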

Offline ansgar

  • *
  • Posts: 80
Re: Experimental on Linux
« Reply #7 on: April 17, 2010, 10:42:35 AM »
Quote from: neroden
To whoever does the automatic builds, could you perhaps build with DEBUG=1 to get us debug information in the official builds of experimental?

Done. Debugging information should be included starting with the next build.

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 18745
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Experimental on Linux
« Reply #8 on: April 17, 2010, 12:26:18 PM »
Ahh - I had taken debugging information out on the "official" builds because I wanted a clean release build for people to use that is as optimised as possible, without (for example) lots of assert checks slowing things down. Indeed, I specifically use #ifdef DEBUG checks to give additional (and not so user friendly) information in the GUI on occasion. Certainly, the Windows version is compiled as a "release" build in MSVC++.

What, I think, we could really do with is having nightly builds for Experimental on all platforms: the nightly builds would have the debugging information turned on, and the release builds would have it turned off.
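
As a minimal sketch of that split between debug and release behaviour: the example assumes the debug build is compiled with -DDEBUG and the release build with -DNDEBUG, and vehicle_label is a made-up function rather than anything in Simutrans.

```cpp
#include <cassert>
#include <string>

// Hypothetical GUI label helper; the name and the text are made up.
std::string vehicle_label(const std::string &name, int internal_id)
{
    // assert() compiles to nothing when NDEBUG is defined.
    assert(internal_id >= 0);
#ifdef DEBUG
    // Extra, less user-friendly detail, only in builds compiled with -DDEBUG.
    return name + " (id " + std::to_string(internal_id) + ")";
#else
    (void)internal_id; // silence "unused" warnings when asserts are off
    return name;
#endif
}
```

Debug symbols (the -g compiler flag) are a separate matter from both macros: they add line numbers to backtraces without changing what the code does.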

As to the original poster's query - if I recall correctly, I think that there are 64-bit Linux builds of Standard. Perhaps some testing could be done of the Standard version to check whether this is a problem there, too? If it is, perhaps this topic needs to be moved out of the "Experimental" section of the board.

Offline neroden

  • Devotees (Inactive)
  • *
  • Posts: 831
  • Nathanael Nerode
Re: Experimental on Linux
« Reply #9 on: April 18, 2010, 05:54:22 PM »
Quote from: jamespetts
Ahh - I had taken debugging information out on the "official" builds because I wanted a clean release build for people to use that is as optimised as possible, without (for example) lots of assert checks slowing things down. Indeed, I specifically use #ifdef DEBUG checks to give additional (and not so user friendly) information in the GUI on occasion. Certainly, the Windows version is compiled as a "release" build in MSVC++.

What, I think, we could really do with is having nightly builds for Experimental on all platforms: the nightly builds would have the debugging information turned on, and the release builds would have it turned off.

As to the original poster's query - if I recall correctly, I think that there are 64-bit Linux builds of Standard. Perhaps some testing could be done of the Standard version to check whether this is a problem there, too? If it is, perhaps this topic needs to be moved out of the "Experimental" section of the board.

Agreed.  This would be very helpful.  Could one of the people with AMD64, where experimental is failing, try the "Standard" 64-bit build, so we know whether the problem is here or there?

Offline sdog

  • Devotee
  • *
  • Posts: 2039
Re: Experimental on Linux
« Reply #10 on: April 18, 2010, 07:22:27 PM »
as far as i remember it was the same issue in standard.
i'm just waiting for the nightly page to come online again, then i'll try.

Offline sdog

  • Devotee
  • *
  • Posts: 2039
Re: Experimental on Linux
« Reply #11 on: April 28, 2010, 11:43:20 AM »
Quote
Could one of the people with AMD64, where experimental is failing, try the "Standard" 64-bit build, so we know whether the problem is here or there?

sim-linux64_2010-04-27_v102.3_r3185

runs

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 18745
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Experimental on Linux
« Reply #12 on: April 28, 2010, 01:37:09 PM »
SDog,

thank you very much for the test there. I don't currently have a Linux system on which to test Simutrans-Experimental, so this might be very hard for me to track down. Is anyone with a 64-bit Linux platform and the ability to compile from source able to assist here? If somebody could compile a build with all the debugging information turned on, it would be useful to have a full backtrace.

The difficult thing about this bug is that it appears to occur in code untouched by Experimental (and thus code with which I am entirely unfamiliar). Can anyone with 64-bit Linux confirm whether the 32-bit version runs satisfactorily? Simutrans does not, to my knowledge, benefit in any way from being compiled in 64-bit: indeed, there are no 64-bit Windows builds for this reason, but it has been suggested that a 64-bit build can be more stable on 64-bit Linux than a 32 bit build.

Finally, can anyone confirm whether this bug still occurs with the latest devel branch? If it works in the most recent versions of Simutrans-Standard, it is possible that some recent change to the code in Standard helped to fix the problem that was present earlier - I do recall that there have been some changes to the code recently that deal with platform/architecture issues.

Offline sdog

  • Devotee
  • *
  • Posts: 2039
Re: Experimental on Linux
« Reply #13 on: April 28, 2010, 10:27:32 PM »
i suppose it's not very urgent, but it would still be good to track down the problem. whether i can help depends mostly on the makefile. can you point me to the github url? (just found it, overlooked it two times.) Give me also some time to get used to git again.

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 18745
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Experimental on Linux
« Reply #14 on: April 28, 2010, 11:14:02 PM »
Sdog,

thank you very much for volunteering to help - much appreciated  :-)

Offline steffen

  • *
  • Posts: 78
Re: Experimental on Linux
« Reply #15 on: May 05, 2010, 03:56:10 PM »
Hi,
"me too" lol. I'm running Gentoo amd64 and the 64bit version of STE segfaults whilst the 32bit version runs fine (ok so far I'm only at the start screen). Do you still need backtraces for this?

I have a little request too - any chance the names of the downloads could be made more informative? Ideal would be a name like "simutrans-experimental-amd64-2010-05-02" (same for the experimental Pak btw, the download filename is without version number). That way it would be much easier to track which version people used when encountering problems.

Here's the MD5 for the version that segfaults:
fd38cb53b873621aeba0c3dd2866ee22  simutrans-exp-latest

And thanks very much for this amazing game :)

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 18745
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Experimental on Linux
« Reply #16 on: May 05, 2010, 04:10:36 PM »
Steffen,

glad that you enjoy Simutrans-Experimental! Ansgar is in charge of the Linux builds, so a request to change the names should be directed to him. As to the crashes - a backtrace would indeed be useful, as it would be good to get a 64-bit version running in case people have problems running the 32-bit version (although note that there is no performance advantage to the 64-bit version as I understand it: if Simutrans is using anywhere near 4 GB of memory, something is wrong in any case).

I should also be interested in your experiences of the stability of the 32-bit version on 64-bit Linux, to see whether a 64-bit version really is needed at all.

Offline neroden

  • Devotees (Inactive)
  • *
  • Posts: 831
  • Nathanael Nerode
Re: Experimental on Linux
« Reply #17 on: May 06, 2010, 06:27:05 PM »
Quote from: steffen
I have a little request too - any chance the names of the downloads could be made more informative? Ideal would be a name like "simutrans-experimental-amd64-2010-05-02" (same for the experimental Pak btw, the download filename is without version number). That way it would be much easier to track which version people used when encountering problems.

Actually, if you poke around a little, you'll find that the files are already available with dates.  simutrans-experimental-latest is a link to the dated one.

Perhaps we should simply point people to

http://www.43-1.org/~simutrans/simutrans-exp/i386/
http://www.43-1.org/~simutrans/simutrans-exp/amd64/

And ask them to get the files with the latest dates?

Offline ansgar

  • *
  • Posts: 80
Re: Experimental on Linux
« Reply #18 on: May 07, 2010, 03:09:54 PM »
Starting with the next build (simutrans|makeobj)-exp-latest should redirect to the files including a date.  This way the browser should save the files that way as well.

Offline steffen

  • *
  • Posts: 78
Re: Experimental on Linux
« Reply #19 on: May 07, 2010, 09:36:43 PM »
Quote from: ansgar
Starting with the next build (simutrans|makeobj)-exp-latest should redirect to the files including a date.  This way the browser should save the files that way as well.

Brilliant, thanks :)

Quote from: jamespetts
[snip] As to the crashes - a backtrace would indeed be useful, as it would be good to get a 64-bit version running in case people have problems running the 32-bit version (although note that there is no performance advantage to the 64-bit version as I understand it: if Simutrans is using anywhere near 4 GB of memory, something is wrong in any case).

I should also be interested in your experiences of the stability of the 32-bit version on 64-bit Linux, to see whether a 64-bit version really is needed at all.

Well so far the 32bit version is running fine for me. I did have reproducible segfaults when making a map with a large number of cities with large numbers of inhabitants but I want to a) use the git-version and b) track it down a bit more precisely before I report that.

As for performance, I agree on the 4GB issue, but doesn't x64 also have substantially more registers (and bigger ones at that) whilst disposing of at least a little bit of the ancient cruft that we carry around since the 8086? I'm no expert with C/C++ but I thought the compiler takes care of this more or less automatically so my gut feeling would be that making simutrans 64bit-safe would be good. Also there are some 64bit arches that do not have hardware support for running 32bit code. And whilst 4GB for a single program may seem obscene now, I think in a few years we'll look at the issue differently. So my vote, realising that I don't get one, goes to continuing the effort for a 64bit version :)
Stand by for the backtrace, this game is just so addictive..

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 18745
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Experimental on Linux
« Reply #20 on: May 08, 2010, 02:10:37 PM »
Steffen,

ahh, yes, it would be good to make it 64-bit compatible for portability, if nothing else. It's not my main priority at present (and it's hard for me to test, as I only have a 32-bit machine), but if you (or anyone else) can find the problem and propose a sensible solution, I'll happily fix it :-)

Offline neroden

  • Devotees (Inactive)
  • *
  • Posts: 831
  • Nathanael Nerode
Re: Experimental on Linux
« Reply #21 on: May 08, 2010, 06:57:14 PM »
We definitely want simutrans to be 64-bit clean.  The thing is, I have no idea what part isn't 64-bit-clean.  Nothing is jumping out at me, and I don't have a 64-bit machine to test on.  There are all kinds of sloppy integer conversions throughout the code, and sometime I'm going to go through and clean up as many as I possibly can, but that's going to take a long time, so if we can find the actual cause of the problem, that would be best.

Offline steffen

  • *
  • Posts: 78
Re: Experimental on Linux
« Reply #22 on: May 09, 2010, 07:11:31 PM »
Ok it seems the self-compiled version runs just fine. I confirmed with file that it's a 64-bit version.
Can you confirm that the 2010-04-11-fd32563 build is the same as the latest git from http://github.com/jamespetts/simutrans-experimental (last date of change is April 10, and the commit ID conspicuously starts with fd32563)? I'm really confused... does anyone have any ideas?

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 18745
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Experimental on Linux
« Reply #23 on: May 09, 2010, 07:14:11 PM »
Are you compiling from the -devel branch or the -master branch?

Offline steffen

  • *
  • Posts: 78
Re: Experimental on Linux
« Reply #24 on: May 09, 2010, 08:45:29 PM »
Good question, I never told git specifically so I would assume it's the master branch. Here's the command I used: git clone http://github.com/jamespetts/simutrans-experimental.git
Which branch is used to create the automatic builds?

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 18745
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Experimental on Linux
« Reply #25 on: May 09, 2010, 09:34:30 PM »
The Master branch is used to create the automatic builds - if you are building the master branch, you will get Simutrans-Experimental 7.3.

Offline steffen

  • *
  • Posts: 78
Re: Experimental on Linux
« Reply #26 on: May 09, 2010, 10:44:43 PM »
Then I'm really confused: how can it be that the auto-build segfaults but my own build works? What exact settings (and GCC and library versions) are used to make the autobuild? I could try rebuilding with those to reduce the number of possible causes.

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 18745
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Experimental on Linux
« Reply #27 on: May 09, 2010, 11:55:34 PM »
This is indeed odd. Perhaps Ansgar can assist...?

Offline ansgar

  • *
  • Posts: 80
Re: Experimental on Linux
« Reply #28 on: May 10, 2010, 03:40:00 AM »
The autobuild is done in Debian Lenny which has these libraries:

libsdl1.2-dev 1.2.13-2
libsdl-mixer1.2-dev 1.2.8-4
zlib1g-dev 1:1.2.3.3.dfsg-12
libpng12-dev 1.2.27-2+lenny2
libbz2-dev 1.0.5-1
gcc 4:4.3.2-2

And this config.simutrans-exp.

Offline sdog

  • Devotee
  • *
  • Posts: 2039
Re: Experimental on Linux
« Reply #29 on: May 10, 2010, 04:36:17 AM »
i tried to compile devel on a x86_64

pulled it that way:
$ git pull <url> devel

Code:
$gcc --version
gcc (Ubuntu 4.4.3-4ubuntu5) 4.4.3

i'm using a config.default similar to ansgars (thanks your post helped me a lot)
same flags, no ccache,

it failed at dataobj/einstellungen.o
Code:
dataobj/einstellungen.cc:979: error: cast from ‘const char*’ to ‘uint32’ loses precision
changed that to uint64, that compiles
next error at
Code:
===> CXX simgraph16.cc
simgraph16.cc: In function ‘void rezoom_img(image_id)’:
simgraph16.cc:1290: warning: comparison between signed and unsigned integer expressions
simgraph16.cc: Assembler messages:
simgraph16.cc:2370: Error: suffix or operands invalid for `jmp'
make: *** [simgraph16.o] Error 1

(continued)
that's the same point where main branch didn't compile for me last time i tried.
it looks quite like assembler to me, i have quite some problems to read c++ already, but here i'm completely lost.

google is my friend, someone else found that too:
http://bugs.gentoo.org/show_bug.cgi?id=291285
didn't really help me though...

seems like prissi can:
http://archive.forum.simutrans.com/topic/06875.0/index.html

and i overlooked it in ansgars config.default
Code:
ifneq ($(shell dpkg-architecture -qDEB_BUILD_ARCH),i386)
  FLAGS += -DUSE_C
endif
i read it as 'if equal' and ignored it.
so now, compiling with += -DUSE_C

which works

minor interruption, to get libbz2-dev
linked without error, testing now

it runs
no errors

that's enough for me today, it's late and i'll go chopping* some heads off in mount and blade,
please let me know what further testing of the binary you need.


here's the binary and the log file for download:
http://dl.dropbox.com/u/1876190/simexp-64-8.0-devel
http://dl.dropbox.com/u/1876190/simexp-64-8.0-devel.log


*i should better use the passive  here, get my head chopped off, still relaxing.


update
couldn't resist trying a bit and caused this:
Code:
Program received signal SIGSEGV, Segmentation fault.
0x000000000059e72c in vector_tpl<koord>::remove_at (this=0x7fffffffb910,
    pos=0, preserve_order=false)
    at boden/wege/../../dataobj/../tpl/vector_tpl.h:213
213 data[pos] = data[count-1];

error is reproducible, just loaded my savegame (the one i uploaded previously), and waited.
gdb output:
http://dl.dropbox.com/u/1876190/gdb.txt

another update
most of the problems here have been discussed in detail in this thread, only a few days ago:
http://forum.simutrans.com/index.php?topic=5066.msg49722#msg49722
« Last Edit: May 11, 2010, 03:04:34 AM by sdog »

Offline sdog

  • Devotee
  • *
  • Posts: 2039
Re: Experimental on Linux
« Reply #30 on: May 11, 2010, 03:01:56 AM »
Quote from: steffen
Then I'm really confused: how can it be that the auto-build segfaults but my own build works? What exact settings (and GCC and library versions) are used to make the autobuild? I could try rebuilding with those to reduce the number of possible causes.

perhaps ccache causes the problem?

this bug "ccache sometimes returns 32bit objects to a 64bit build" is rather old, but it might still be worth checking ccache?
http://bugs.gentoo.org/show_bug.cgi?id=196243



Offline steffen

  • *
  • Posts: 78
Re: Experimental on Linux
« Reply #31 on: May 11, 2010, 05:51:22 AM »
Good idea but I don't have ccache installed so that can't be it :(
I'm not sure if you understood me right, my own build appears to work fine (it loads with the experimental britain pak, it loads my save game originally made with the i386 autobuild, I tried running it a couple of minutes) but the autobuild segfaults before it even asks me which pak I want to use.
So we can exclude my run-environment, the actual code and the pak as sources of the problem with the autobuild. (I refer to the problem that it doesn't even launch, not the problem in vector_tpl.h.)

Now off the top of my head I can think of these remaining suspects:
- lack of DEBUG in the 04-11 autobuild causes the problem
- some other difference between sdog's and mine vs. the autobuild config causes it
- the versions of GCC/libraries used in autobuild cause it
- the versions of GCC/libs used during compile are incompatible with the versions used by sdog and me.

Not knowing much of C the last two are my favourites right now, so I'm emerging the old libraries.

sdog: I think it might be worth opening a new thread for the segfault you discovered while running your own build. I think it's safe to assume that that is a different fault than whatever is causing the autobuild to segfault on startup.

Offline neroden

  • Devotees (Inactive)
  • *
  • Posts: 831
  • Nathanael Nerode
Re: Experimental on Linux
« Reply #32 on: May 11, 2010, 06:20:45 AM »
Quote from: steffen
Good idea but I don't have ccache installed so that can't be it :(

No, no, the autobuild might have ccache installed, and that may be causing breakage there.

Offline steffen

  • *
  • Posts: 78
Re: Experimental on Linux
« Reply #33 on: May 11, 2010, 06:33:03 AM »
Quote from: neroden
No, no, the autobuild might have ccache installed, and that may be causing breakage there.

Ah ok, yes that could be a cause. But the bug linked appears to be specific to Gentoo's handling of having two-in-one platforms, and was marked fixed (by update to an eclass, which are again Gentoo-specific) more than a year ago. But I guess until proven otherwise we should consider that an option.

Ansgar: Is ccache active on the autobuild system?

Offline neroden

  • Devotees (Inactive)
  • *
  • Posts: 831
  • Nathanael Nerode
Re: Experimental on Linux
« Reply #34 on: May 11, 2010, 06:54:54 AM »
Quote from: sdog
it failed at dataobj/einstellungen.o
Code:
dataobj/einstellungen.cc:979: error: cast from ‘const char*’ to ‘uint32’ loses precision
changed that to uint64, that compiles

Is this bug still happening with -DUSE_C?  If so please rerun it with the latest devel and give me a new line number, since the old one seems a bit out of date.  This is the sort of thing I should be able to solve.

Quote
update
couldn't resist trying a bit and caused this:
Code:
Program received signal SIGSEGV, Segmentation fault.
0x000000000059e72c in vector_tpl<koord>::remove_at (this=0x7fffffffb910,
    pos=0, preserve_order=false)
    at boden/wege/../../dataobj/../tpl/vector_tpl.h:213
213 data[pos] = data[count-1];

error is reproduceable, just loaded my savegame (the one i uploaded previously), and waited.
Oh hell, I found a logic bug in vector_tpl in experimental -- nasty off-by-one error.  Don't roll your own container classes, folks!

Patch is on the jp-devel branch of my git repo (git://github.com/neroden/simutrans)

If that doesn't fix it, or if the bug is in standard (which doesn't contain the off-by-one error) then I need two pieces of information from gdb: the output of 'print count' and 'bt'.
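
The actual patch lives on the jp-devel branch and is not reproduced here. As a rough illustration of the kind of guard that keeps a line like data[pos] = data[count-1] from touching invalid memory, here is a hypothetical small_vector (not Simutrans' vector_tpl) that checks the index before doing anything and leans on std::vector for storage. All names in it are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a hand-rolled vector; not Simutrans' vector_tpl.
template <class T>
class small_vector
{
public:
    void append(const T &value) { data.push_back(value); }

    std::size_t get_count() const { return data.size(); }

    const T &operator[](std::size_t pos) const
    {
        assert(pos < data.size());
        return data[pos];
    }

    // Remove the element at pos. Returns false without touching memory
    // when pos is out of range, which covers every pos on an empty vector.
    bool remove_at(std::size_t pos, bool preserve_order = true)
    {
        if (pos >= data.size()) {
            return false;
        }
        if (preserve_order) {
            data.erase(data.begin() + static_cast<std::ptrdiff_t>(pos));
        } else {
            // Cheap removal: overwrite with the last element, then shrink.
            data[pos] = data.back();
            data.pop_back();
        }
        return true;
    }

private:
    std::vector<T> data;
};
```

With preserve_order set to false the last element moves into the freed slot, so the order of the remaining elements changes.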

Offline sdog

  • Devotee
  • *
  • Posts: 2039
Re: Experimental on Linux
« Reply #35 on: May 11, 2010, 07:12:32 AM »
Quote from: neroden
Is this bug still happening with -DUSE_C?  If so please rerun it with the latest devel and give me a new line number, since the old one seems a bit out of date.

it still happens with -DUSE_C.
Code:
dataobj/einstellungen.cc:982: error: cast from ‘const char*’ to ‘uint32’ loses precision

Quote
Oh hell, I found a logic bug in vector_tpl in experimental -- nasty off-by-one error.  Don't roll your own container classes, folks!

Patch is on the jp-devel branch of my git repo (git://github.com/neroden/simutrans)

If that doesn't fix it, or if the bug is in standard (which doesn't contain the off-by-one error) then I need two pieces of information from gdb: the output of 'print count' and 'bt'.
I pulled vector_tpl from your branch, changed einstellungen.cc:982 to uint64 and compiled it.
i'll let it run on fast time, and go away for a while.

It didn't crash. You must have fixed it.

The game froze however, after playing around in the depot a bit. No input possible, date stuck at 1 February, news ticker in bottom panel continued to run.

killed it with ^c
Code:
^C
Program received signal SIGINT, Interrupt.
0x000000000066b8e1 in colorpixcopy (dest=0x891c75e, src=0x11a8754, end=0x11a877e) at simgraph16.cc:1932
1932 while (src < end) {
(gdb) bt
#0  0x000000000066b8e1 in colorpixcopy (dest=0x891c75e, src=0x11a8754, end=0x11a877e) at simgraph16.cc:1932
#1  0x000000000066c3f7 in display_color_img_aux (sp=0x11a8730, x=389, y=471, h=19) at simgraph16.cc:2526
#2  0x000000000066cb58 in display_base_img (n=1638, xp=341, yp=410, player_nr=0 '\000', daynight=0, dirty=1)
    at simgraph16.cc:2642
#3  0x00000000004bfa79 in gui_image_list_t::zeichnen (this=0x7899bf8, parent_pos=...)
    at gui/components/gui_image_list.cc:99
#4  0x00000000004fe3a9 in gui_container_t::zeichnen (this=0x789a918, offset=...) at gui/gui_container.cc:123
#5  0x00000000004c4e2f in gui_scrollpane_t::zeichnen (this=0x7899d18, pos=...)
    at gui/components/gui_scrollpane.cc:148
#6  0x00000000004c63df in gui_tab_panel_t::zeichnen (this=0x78993a8, parent_pos=...)
    at gui/components/gui_tab_panel.cc:121
#7  0x00000000004fe3a9 in gui_container_t::zeichnen (this=0x78991f0, offset=...) at gui/gui_container.cc:123
#8  0x00000000004b6227 in gui_convoy_assembler_t::zeichnen (this=0x78991f0, parent_pos=...)
    at gui/components/gui_convoy_assembler.cc:518
#9  0x00000000004fe3a9 in gui_container_t::zeichnen (this=0x7898948, offset=...) at gui/gui_container.cc:123
#10 0x00000000004ffed5 in gui_frame_t::zeichnen (this=0x7898940, pos=..., gr=...) at gui/gui_frame.cc:166
#11 0x00000000004e5d7a in depot_frame_t::zeichnen (this=0x7898940, pos=..., groesse=...) at gui/depot_frame.cc:613
#12 0x0000000000616011 in display_win (win=0) at simwin.cc:647
#13 0x00000000006160b3 in display_all_win () at simwin.cc:670
#14 0x0000000000617843 in win_display_flush (konto=223383.38) at simwin.cc:1023
#15 0x00000000005dd956 in intr_refresh_display (dirty=false) at simintr.cc:76
#16 0x00000000006278b0 in karte_t::sync_step (this=0xeb1960, delta_t=133, sync=false, display=true)
    at simworld.cc:2792
#17 0x00000000005dda49 in interrupt_check (caller_info=0x6a0c81 "simfab 636") at simintr.cc:101
#18 0x00000000005c3fa3 in fabrik_t::step (this=0x7355440, delta_t=100) at simfab.cc:1068
#19 0x0000000000629d14 in karte_t::step (this=0xeb1960) at simworld.cc:3419
#20 0x0000000000633e46 in karte_t::interactive (this=0xeb1960, quit_month=2147483647) at simworld.cc:5725
---Type <return> to continue, or q <return> to quit---
#21 0x00000000005e6633 in simu_main (argc=1, argv=0x7fffffffe2e8) at simmain.cc:1075
#22 0x00000000006713e3 in main (argc=1, argv=0x7fffffffe2e8) at simsys_s.cc:748


ccache is used for the auto builds, according to the config.default Ansgar posted. Perhaps it's worth a try for him to compile without ccache?
« Last Edit: May 11, 2010, 07:33:20 AM by sdog »

Offline ansgar

  • *
  • Posts: 80
Re: Experimental on Linux
« Reply #36 on: May 11, 2010, 01:15:44 PM »
Quote from: steffen
Ansgar: Is ccache active on the autobuild system?

Yes, but I configured ccache to use a different cache directory for amd64 and i386 (IIRC it didn't work at all before).  I just cleaned the cache anyway.

Offline neroden

  • Devotees (Inactive)
  • *
  • Posts: 831
  • Nathanael Nerode
Re: Experimental on Linux
« Reply #37 on: May 11, 2010, 02:44:03 PM »
Quote from: sdog
it still happens with -DUSE_C.
Code:
dataobj/einstellungen.cc:982: error: cast from ‘const char*’ to ‘uint32’ loses precision

This was a silly coding error in experimental.  I just fixed it on my jp-devel branch.  I'm surprised it worked on 32-bit machines, as it shouldn't have!
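
For readers who hit the same compiler message: on amd64 a pointer is 64 bits wide, so it cannot fit into a uint32, which is what the error says. The sketch below is not the actual fix in einstellungen.cc (the thread does not show that line). address_of and parse_u32 are hypothetical helpers illustrating the two usual intentions behind such a cast.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical helpers; neither is taken from einstellungen.cc.

// If the address itself is wanted as an integer, uintptr_t is wide enough
// to hold a pointer on both i386 and amd64, unlike a 32-bit type.
std::uintptr_t address_of(const char *text)
{
    return reinterpret_cast<std::uintptr_t>(text);
}

// If the number written in the string is wanted, parse the string
// instead of casting the pointer.
std::uint32_t parse_u32(const char *text)
{
    assert(text != nullptr);
    return static_cast<std::uint32_t>(std::strtoul(text, nullptr, 10));
}
```

Casting to a 64-bit integer silences the error on amd64, but if the code really meant to read a value out of the string, the cast is wrong on every platform and only the parsing variant gives the intended result.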

The lockup is going to be harder to debug; the place you stopped it is working just fine (as proved by the fact that the ticker keeps running, in a single-threaded program).
« Last Edit: May 11, 2010, 02:50:54 PM by neroden »

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 18745
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Experimental on Linux
« Reply #38 on: May 11, 2010, 09:12:03 PM »
Knightly,

I'm probably the silly coder in this case. I'd be very grateful if you could release your updated version; although performance is slower, I find it acceptable enough, and I have made it optional in Experimental in any case for those who don't like the extra time that it takes to generate a map.

Offline steffen

  • *
  • Posts: 78
Re: Experimental on Linux
« Reply #39 on: May 11, 2010, 10:52:45 PM »
Ok it seems I found what causes the segfault on startup with the autobuild. It's either OPTIMISE = 1 or -DNEW_PATHING.
Here's the backtrace for a start. I'll now test which of the two is the actual culprit and then try it with devel. I realise now that I should've tried devel first; I didn't take into account that these flags can influence which code is compiled *sigh*

Program received signal SIGSEGV, Segmentation fault.
0x000000000062997b in display_fb_internal (xp=<value optimized out>, yp=<value optimized out>, w=<value optimized out>, h=88,
    color=<value optimized out>, dirty=<value optimized out>, cL=<value optimized out>, cR=<value optimized out>, cT=0,
    cB=<value optimized out>) at simgraph16.cc:2959
2959                                    *lp++ = longcolval;
(gdb) bt
#0  0x000000000062997b in display_fb_internal (xp=<value optimized out>, yp=<value optimized out>, w=<value optimized out>, h=88,
    color=<value optimized out>, dirty=<value optimized out>, cL=<value optimized out>, cR=<value optimized out>, cT=0,
    cB=<value optimized out>) at simgraph16.cc:2959
#1  0x0000000000629aee in display_fillbox_wh (xp=44, yp=176, w=20338, h=1, color=<value optimized out>, dirty=-135966870)
    at simgraph16.cc:2990
#2  0x00000000004d6c1b in gui_frame_t::zeichnen (this=<value optimized out>, pos=..., gr=<value optimized out>) at gui/gui_frame.cc:155
#3  0x00000000004fe59c in pakselector_t::zeichnen (this=0x2c, p=..., gr=<value optimized out>) at gui/pakselector.cc:64
#4  0x00000000005aa74e in ask_objfilename (argc=1, argv=<value optimized out>) at simmain.cc:247
#5  simu_main (argc=1, argv=<value optimized out>) at simmain.cc:644
#6  0x000000000062dc67 in main (argc=1, argv=0x7fffffffdac8) at simsys_s.cc:743
(gdb) quit

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 18745
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Experimental on Linux
« Reply #40 on: May 11, 2010, 10:57:59 PM »
Steffen,

the "NEW_PATHING" preprocessor directive has long been deprecated, so that can be eliminated as a possible cause. References to it should be removed. I should also note that none of those parts of the code, so far as I am aware, have had any modifications for Experimental.

Offline steffen

  • *
  • Posts: 78
Re: Experimental on Linux
« Reply #41 on: May 11, 2010, 11:06:32 PM »
yes you're right, I found that activating NEW_PATHING lets the program run fine, whilst OPTIMISE breaks it. Now I looked in the Makefile and the only conclusion I can draw is that it's a toolchain bug. If anyone agrees I'll head over to GCC and post a bug report.

ansgar: in the meantime, can you deactivate OPTIMISE for the amd64 autobuild?

Offline sdog

  • Devotee
  • *
  • Posts: 2039
Re: Experimental on Linux
« Reply #42 on: May 11, 2010, 11:12:36 PM »
i thought i compiled it with optimisation. since i'm not on the same machine now, i can't check which level. -O3 would be the most likely however.

Offline steffen

  • *
  • Posts: 78
Re: Experimental on Linux
« Reply #43 on: May 11, 2010, 11:31:30 PM »
What version of GCC are you using?

Offline ansgar

  • *
  • Posts: 80
Re: Experimental on Linux
« Reply #44 on: May 12, 2010, 12:13:28 AM »
Quote
yes you're right, I found that activating NEW_PATHING lets the program run fine, whilst OPTIMISE breaks it. Now I looked in the Makefile and the only conclusion I can draw is that it's a toolchain bug. If anyone agrees I'll head over to GCC and post a bug report.
More likely is a bug in Simutrans.  An optimizer may make certain assumptions about the code, but a program could be written in a way that these are not true.  In that case it will likely crash or give wrong results.
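
A classic example of such an assumption, purely as an illustration and not taken from the Simutrans sources: signed integer overflow is undefined in C++, so the optimizer is allowed to assume it never happens and may delete a check that relies on it. The function names below are made up for this sketch.

```cpp
#include <cassert>
#include <climits>

// Broken: if 'a + b' overflows, the behaviour is undefined, so with
// optimisation the compiler may treat 'a + b < a' as always false for
// b > 0 and drop the test entirely.
//   bool would_overflow_bad(int a, int b) { return b > 0 && a + b < a; }

// Well-defined: compare against the limits before adding.
inline bool would_overflow(int a, int b)
{
    if (b > 0) {
        return a > INT_MAX - b;
    }
    return a < INT_MIN - b;
}

// Adds two ints, clamping at the limits instead of overflowing.
inline int saturating_add(int a, int b)
{
    if (would_overflow(a, b)) {
        return b > 0 ? INT_MAX : INT_MIN;
    }
    return a + b;
}

// Small sanity check of the two helpers above.
inline void check_saturating_add()
{
    assert(saturating_add(1, 2) == 3);
    assert(saturating_add(INT_MAX, 1) == INT_MAX);
    assert(saturating_add(INT_MIN, -1) == INT_MIN);
}
```

Code like the commented-out variant tends to work in a debug build and then misbehave with optimisation switched on, which from the outside looks a lot like a compiler bug.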

Quote
ansgar: in the meantime, can you deactivate OPTIMISE for the amd64 autobuild?
Sure.

Offline neroden

  • Devotees (Inactive)
  • *
  • Posts: 831
  • Nathanael Nerode
Re: Experimental on Linux
« Reply #45 on: May 12, 2010, 03:48:45 AM »
Quote
Knightly,

I'm probably the silly coder in this case. I'd be very grateful if you could release your updated version;
James, I think I'm the one who made the 'silly coding error' comment, and you've already fixed it by pulling from my jp-devel branch.  Isn't git wonderful?  (Says the new convert who wouldn't use it two months ago!)

Quote
More likely is a bug in Simutrans.  An optimizer may make certain assumptions about the code, but a program could be written in a way that these are not true.  In that case it will likely crash or give wrong results.

One example which would work differently in 64-bit and 32-bit arithmetic is bit-twiddling with & and |  -- the bitmasks might be right for 32-bit, wrong for 64-bit.  There are a bunch of other things like this.
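
A small sketch of the kind of mask problem meant here. It is an invented example, not code from Simutrans, and `align16_bad`, `align16` and `check_alignment_masks` are hypothetical names. A mask written as a 32-bit constant is zero-extended when combined with a 64-bit value, so the `&` silently clears the upper 32 bits as well.

```cpp
#include <cassert>
#include <cstdint>

// Broken for 64-bit values: the mask is a 32-bit constant, so it is
// zero-extended and the '&' also clears bits 32..63.
inline std::uint64_t align16_bad(std::uint64_t value)
{
    return value & 0xFFFFFFF0u;
}

// Correct: build the mask in the same width as the value.
inline std::uint64_t align16(std::uint64_t value)
{
    return value & ~static_cast<std::uint64_t>(0xF);
}

// Shows where the two variants agree and where they diverge.
inline void check_alignment_masks()
{
    const std::uint64_t high = 0x100000012ULL;   // needs more than 32 bits
    assert(align16(high) == 0x100000010ULL);
    assert(align16_bad(high) == 0x10ULL);        // upper bits silently lost
    assert(align16(0x12) == align16_bad(0x12));  // small values agree
}
```

As long as every value fits in 32 bits the two functions give the same answer, so a bug like this can sit unnoticed until the code runs with 64-bit addresses or sizes.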

Offline steffen

  • *
  • Posts: 78
Re: Experimental on Linux
« Reply #46 on: May 12, 2010, 04:10:30 AM »
This reminds me why I use Python for my own programs lol.
In any case I found another reproducible segfault, but I'll post it as a new thread since it appears to be a different issue.

And yes, git is brilliant :)

Offline sdog

  • Devotee
  • *
  • Posts: 2039
Re: Experimental on Linux
« Reply #47 on: May 12, 2010, 05:37:14 AM »
python...  to-morrow i have to look at my fortran 77 code all day long.

you let me dream of tropical paradises with pythons and girls wearing perls and rubies, who certainly weren't named ada.


i'm pretty thankful james decided to use git, having to learn svn is something i gladly pass up for more joyful things. well, about twenty million less joyful things spring to my mind though. all are related to horrible deaths, root canal treatments, north-american coffee or having smalltalk with ada.

Offline sdog

  • Devotee
  • *
  • Posts: 2039
Re: Experimental on Linux
« Reply #48 on: May 19, 2010, 02:00:09 AM »
since i don't want to cp the file after building to my simutrans dir, i just put a symlink to the build in the development directory. it doesn't work. simutrans did not look in my pwd for the pakfile, and exited when it couldn't find one.

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 18745
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Experimental on Linux
« Reply #49 on: May 19, 2010, 10:18:16 AM »
Sdog,

I don't think that this is a Simutrans-Experimental specific issue.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9568
  • Languages: De,EN,JP
Re: Experimental on Linux
« Reply #50 on: May 19, 2010, 02:34:21 PM »
Simutrans uses the program directory, i.e. the one you copied it to. If it should use the current directory instead, it must be called with "-use_workdir"; see the readme.

Offline sdog

  • Devotee
  • *
  • Posts: 2039
Re: Experimental on Linux
« Reply #51 on: May 19, 2010, 03:27:00 PM »
thanks prissi, i already suspected i made a mistake when compiling


@james, yes it isn't, but it is too unimportant to warrant a new thread.

Offline steffen

  • *
  • Posts: 78
Re: Experimental on Linux
« Reply #52 on: May 28, 2010, 10:13:54 PM »
Ok well I definitely can't reproduce that segfault I was getting anymore; either an update or my world recompile fixed it :)