Many new CUDA developers tend to build on the CUDA SDK makefiles (by including common.mk) and end up with loads of dependencies on the CUDA utilities that the SDK provides. In the long run this does not seem like a good idea: the build depends on whatever the SDK makefile does, and hence on the (heavy) SDK itself. Here is a small tutorial on how to compile a standalone CUDA program with multiple .h, .cpp and .cu files, plus a few external headers/libs:
Let's say we have 4 files: main.cpp, test.cpp, test.h, kernel.cu
The idea is that we have to compile the cpp files with g++, and the cu files with nvcc.
We also tell each compiler to compile only, without linking (the -c flag), so that references between files are left for the final link step instead of producing zillions of errors.
To do this, compile the cpp files this way:
g++ -c *.cpp
Then compile the cu files:
nvcc -c *.cu
Note that nvcc may not be in your default path. The CUDA toolkit installs it in its bin directory. For example, if you installed the CUDA toolkit in /usr/local/cuda, then nvcc is at /usr/local/cuda/bin/nvcc. Adding that bin directory (/usr/local/cuda/bin) to your $PATH variable will fix things.
The above steps should produce 3 object files: main.o, test.o, and kernel.o
Finally, link these together and make the final executable:
g++ -o runme *.o
This will link all the object files into a single executable called "runme", which you can run as usual.
To include any specific headers / libraries, simply pass them as arguments during the compile / link phases. For example:
g++ -c -I/usr/local/cuda/include *.cpp
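g++ -o runme *.o -L/usr/local/cuda/lib -lcuda -lcudart
(The libraries are listed after the object files because some linkers resolve symbols in command-line order. On 64-bit installs the directory may be lib64 instead of lib.)
While running CUDA programs, you may get a library not found error. To fix this, add the directory containing the CUDA libraries to your $LD_LIBRARY_PATH environment variable. The libraries in question are libcuda.so and libcudart.so. libcudart.so is in the lib directory of your CUDA toolkit install path, while libcuda.so is normally installed along with the NVIDIA driver.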
All this can be condensed into a nice makefile for convenience.
Finally, the part about actually writing CUDA programs that are independent of the SDK. Note that the SDK uses macros like CUT_DEVICE_INIT(), which in practice are not needed at all. Simply include the CUDA header files (for example cuda_runtime.h) and start making CUDA calls.
Hi:
I tried to follow your idea, but no success :(
I have a large program with .cpp and .h files distributed across different folders. Inside a class defined in a .h file, I placed a function; that function is declared in a .cu file.
I obtained the .o files from the .cpp files with no problem.
I obtained the .o files from the .cu files with no problems.
However, when linking, all hell breaks loose. The errors fill more than a screen. It seems that it does not recognize the cutil functions and, worse, it says that many of my class definitions are wrong! (They are not, since my code works perfectly when run on a CPU.)
So, please, if you have faced this problem, I would be glad if you can share your experiences.
Hi Daniel,
You can make sure CUDA (tries to) use C++ compilation by passing the "--host-compilation=c++" flag to nvcc if you are using an older version of the CUDA toolkit (newer versions do this by default).
Also, I wonder why you are using cutil. Anyway, are you specifying the paths for the cutil libraries while linking? I see a whole bunch of libs in my SDK install path: "NVIDIA_GPU_Computing_SDK/shared/lib/linux". You may want to include them in your linking process like this:
g++ *.o -L(cuda-sdk-path)/shared/lib/linux/ -L/usr/local/cuda/lib -lcuda -lcudart
(of course, you should replace (cuda-sdk-path) with the actual path to your cuda sdk)
Further, to make sure any functions you define in the '.cu' file are visible to the cpp files, you need to declare them as extern "C". Here is an example:
extern "C" void generate_photon_map(float4* buffer)
{
....
}
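A minimal sketch of the linkage pattern above, written as one plain C++ file so it can be tried without the CUDA toolkit. In a real project the declaration would sit in a header included by the cpp files and the definition in the .cu file. The name scale_buffer and its host loop are assumptions for illustration, standing in for a kernel launcher such as generate_photon_map.

```cpp
#include <cassert>
#include <cstddef>

// Declaration: this is what a shared header (included by the .cpp files) would hold.
// extern "C" switches off C++ name mangling, so the symbol is simply "scale_buffer".
extern "C" void scale_buffer(float* buffer, std::size_t count, float factor);

// Definition: in a real project this would live in the .cu file and launch a kernel.
// Here a host loop stands in for the kernel (an assumption, to keep this runnable).
extern "C" void scale_buffer(float* buffer, std::size_t count, float factor)
{
    assert(buffer != nullptr || count == 0);
    for (std::size_t i = 0; i < count; ++i) {
        buffer[i] *= factor;
    }
}
```

The important part is that the declaration seen by g++ and the definition seen by nvcc agree on linkage; if only one side says extern "C", the linker will look for two different symbol names.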
Hope this helps,
kashyap
Thanks for your quick reply. But it doesn't work. I got same errors. (aside from lcutil, which I solved thanks to you)
What I did was take the cppIntegration project and try to apply your steps. It DOES NOT work!
It shows problems with 'multiple definitions' of 'kernel' and 'kernel2', which are CUDA kernel functions defined with __host.
So, please, if you can replicate the same experiment (it should take less than 5 minutes) and let me know what happens, I would be glad.
Daniel.
hmm, I tried this, and my makefile looks something like this:
=================================
SDKPATH := ~/NVIDIA_GPU_Computing_SDK
CUDAPATH := /usr/local/cuda
LDFLAGS := -L$(CUDAPATH)/lib -L$(SDKPATH)/shared/lib/linux -L$(SDKPATH)/C/lib -lcutil -lcudpp -lcuda -lcudart
CXFLAGS := -I$(CUDAPATH)/include -I$(SDKPATH)/shared/inc -I$(SDKPATH)/C/common/inc
CXX := g++
NVCC := $(CUDAPATH)/bin/nvcc
$(EXECUTABLE): cppIntegration.o main.o cppIntegration_gold.o
	$(CXX) $(LDFLAGS) -o $(EXECUTABLE) cppIntegration.o main.o cppIntegration_gold.o
cppIntegration.o: cppIntegration.cu
	$(NVCC) $(CXFLAGS) -c cppIntegration.cu
main.o: main.cpp
	$(CXX) $(CXFLAGS) -c main.cpp
cppIntegration_gold.o: cppIntegration_gold.cpp
	$(CXX) $(CXFLAGS) -c cppIntegration_gold.cpp
=====================================
Does this work for you?
oops, add this line to the top:
EXECUTABLE := cppIntegration
Thanks a lot!
Actually, I discovered a mistake: I was creating a .o file for the file cppIntegration_kernel. According to some forums, that was wrong, since the functions inside end up defined twice and then the linker gets confused.
I see that in your makefile you didn't create that .o file, that's why it works (I tested it!)
Now, back to my large problem:
Well, the problem with the linker still exists. The number of errors is far smaller, though:
I get errors like:
"multiple definition of "
The thing is that they come from the .o file of my cpp files, created with g++ (not with nvcc)
I believe the compiler (g++) is declaring the functions twice and then the linker gets confused. But I don't know how to correct that. If you ever faced this problem, just let me know.
Thanks for your help,
Daniel.
OK, I think the reason you get multiple definition errors is that you are compiling the same stuff twice. Here is an example:
Let's say we have 2 files: A and B
Inside B, we have #include "A"
If you compile both A and B and link the results, you will get an error because everything defined in A gets compiled twice: once on its own and once as part of B.
This is what was happening with the _kernel file. It was included inside cppIntegration.cu (I think).
The way to fix this is:
1. Enclose your header files inside #ifndef ... #endif include guards, so they are not included twice within the same source file:
example header file (thisheader.h):
#ifndef THISHEADER_H
#define THISHEADER_H
void blahblah();
#endif
2. Do not put variable definitions or function definitions in include files. Put only function prototypes (and extern variable declarations) in include files, and put the definitions in cpp files.
Of course, you may already know this, but I was just stating my experiences with linker problems...
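Both suggestions can be sketched in one file. The header contents are pasted twice here to mimic a double #include. The Photon struct and photon_sum are hypothetical names, not taken from anyone's project.

```cpp
#include <cassert>

// ---- first inclusion of a hypothetical thisheader.h ----
#ifndef THISHEADER_H
#define THISHEADER_H
struct Photon { float x, y, z; };    // a type definition: repeating it in one file is an error
float photon_sum(const Photon* p);   // prototype only; the body lives in exactly one .cpp file
#endif

// ---- second inclusion (e.g. pulled in again through another header) ----
#ifndef THISHEADER_H                 // already defined, so the preprocessor skips this copy
#define THISHEADER_H
struct Photon { float x, y, z; };
float photon_sum(const Photon* p);
#endif

// ---- thisheader.cpp: the single definition ----
float photon_sum(const Photon* p)
{
    assert(p != nullptr);
    return p->x + p->y + p->z;
}
```

Note that the guard only protects a single source file from seeing the header twice. If a function body or variable definition sits in the header, every source file that includes it still gets its own copy, and the linker reports multiple definitions. That is why suggestion 2 matters separately.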
I did apply your suggestion (1), but got the same errors. Your suggestion (2) is the one that I finally applied, but with a difference:
I didn't modify the header files, in which there are multiple variable and class definitions. What I did was modify my .cu files so that they don't deal with my custom types/classes. And now it works.
However, what bugs me is that it seems that I CANNOT use custom types/classes defined for the cpp files in my .cu files. Is that true? Or is there a way around it?
Thanks for your responses.
Daniel.
Hi, it's a bit hard doing this over the comments section :P
Can you send me the code so I can try it?
sriramkashyap AT_THE_RATE_GMAIL_DOT_COM
Sorry, but I don't like spam bots very much :)
Even otherwise, classes can be used to a limited extent in .cu files... you can try specifying the nvcc option "--host-compilation=c++".
Using C++ classes in a .cu file is not a problem as long as your device and kernel functions don't have to deal with them. Even otherwise, I have been able to use templates and operator overloading in device functions.
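One way to keep classes out of the device code is to unpack them on the host and hand only raw pointers and sizes across the boundary. The sketch below is plain C++ so it runs without CUDA. Image, fill_buffer and clear_image are hypothetical names, and the loop in fill_buffer stands in for what would really be a device copy plus a kernel launch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A host-side C++ class (hypothetical). It never crosses into device code.
class Image {
public:
    Image(std::size_t width, std::size_t height)
        : width_(width), height_(height), pixels_(width * height, 0.0f) {}
    float* data() { return pixels_.data(); }
    std::size_t size() const { return pixels_.size(); }
    std::size_t height() const { return height_; }
    float at(std::size_t x, std::size_t y) const { return pixels_[y * width_ + x]; }
private:
    std::size_t width_;
    std::size_t height_;
    std::vector<float> pixels_;
};

// Stand-in for a function defined in a .cu file. Its interface uses only a plain
// pointer and a size, so the kernel it would launch never sees the Image class.
void fill_buffer(float* buffer, std::size_t count, float value)
{
    assert(buffer != nullptr || count == 0);
    for (std::size_t i = 0; i < count; ++i) {
        buffer[i] = value;  // a real version would copy to the GPU and launch a kernel
    }
}

// Host-side glue: unpack the class and pass raw data across the boundary.
void clear_image(Image& image, float value)
{
    fill_buffer(image.data(), image.size(), value);
}
```

With this split, the class headers only need to be included by code that g++ compiles, and the .cu side stays free of custom types.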
Hi Kashyap,
Can you send me your running code sample? This is very important for me.
abeynk[AT]gmail[DOT]com
Hello, thanks for your post, it is very useful.
Anyway, I'm pretty new to CUDA coding, and I'm just trying to understand which libraries are useful and how the code flow works.
Do you mind if I ask you to send me a simple project with a few source files and the makefile? I want to try to compile and link it with your method, and see how to create the makefile.
I'm on a university project and it's very important for me.
gianpiero.gibiino[AT]gmail[DOT]com
Hi,
I am a beginner in CUDA and this was a great help.
In my case I had to use the 64 bit version of the libraries, though.
Thank you for your blog entry
Thanks for the Makefile Kashyap. But I had to make a small correction before it worked for me.
I had to move the "-lcutil -lcudpp -lcuda -lcudart" part to the end of the linking command:
$(CXX) $(LDFLAGS) -o $(EXECUTABLE) cppIntegration.o main.o cppIntegration_gold.o -lcutil -lcudpp -lcuda -lcudart
Odd, $(LDFLAGS) should take care of that. It contains all the libraries... though some linkers do resolve libraries in command-line order, so listing them after the object files is the safer choice.
I tried this with three files: Main.cpp, KernelWrapper.cu and MyKernel.cu
When I create objects from the cu files with nvcc like this:
nvcc -c KernelWrapper.cu MyKernel.cu
I get the two object files KernelWrapper.obj and MyKernel.obj.
If I use nvcc to do the final linking like this:
nvcc -o runme Main.cpp KernelWrapper.obj MyKernel.obj
it works fine!
But if I use g++ for the last step it gives me pages of errors including many strange looking "undefined reference" errors.
I'm hoping it will be possible to use g++ in the final step, because Main.cpp in my little test is a stand-in for a lot more C++ code (which uses STL).
nvcc is not so good with STL, I read somewhere, and besides that, I have a Makefile for the existing code which I would like to reuse with as few changes as possible.
My hope was that I could compile all CUDA code with nvcc and then pass the objects to g++. Shouldn't this be possible?
I'm using Win7 and minGW.
The Main.cpp file is very simple - it calls a function in the kernelwrapper and then sends text to cout.
nvcc expects a kernel and the code that launches it to be in the same cu file (at least with the toolkit versions I have used). So keep one main cu file and #include all other cu files into it.
So in the end, you should compile only the main cu file using nvcc. Also, do all the compiles (-c) first, even with g++, and for the final linking step, pass only object files to g++. So here is what you do (assuming KernelWrapper.cu is your main cu file):
nvcc -c KernelWrapper.cu
g++ -c Main.cpp
g++ -o runme Main.o KernelWrapper.obj
hope this works.
Hi and thanks for the answer.
I still have problems though.
I compile the cu file. I have now moved the kernel method into this file, so MyKernel.cu is no longer needed.
nvcc -c KernelWrapper.cu
I then compile the Main.cpp with g++.
but when I get to the linking I get a lot of undefined references.
I try to link like this:
g++ -o runme Main.o KernelWrapper.obj -L"D:\CUDA\v3.2\lib\Win32" -lcuda -lcudart
but I get a whole bunch of undefined reference errors!
A very few of them are:
KernelWrapper.obj:(.text+0x32): undefined reference to `?cout@std@@3V?$basic_ostream@DU?$char_traits@D@std@@@1@A'
KernelWrapper.obj:(.text[??6?$basic_ostream@DU?$char_traits@D@std@@@std@@QAEAAV01@H@Z]+0x19): undefined reference to `__security_cookie'
KernelWrapper.obj:(.text[?_Decref@facet@locale@std@@QAEPAV123@XZ]+0xf): undefined reference to `??0_Lockit@std@@QAE@H@Z'
KernelWrapper.obj:(.text[?clear@ios_base@std@@QAEXH_N@Z]+0x63): undefined reference to `_CxxThrowException@8'
I think there are almost 100 of these errors.
Also one for Main.cpp:
Main.o:Main.cpp:(.text+0xf): undefined reference to `RunTest()'
I include cuda_runtime.h in KernelWrapper.cu and as far as I can see it looks like it can find the cuda and cudart objectfiles just fine.
Do you have a suggestion about what could be wrong here?
Looks like some basic libraries are not being used while linking, so your standard C++ stuff (like cout) is not being resolved.
A likely cause: the mangled names in the errors (such as ?cout@std@@...) follow the Visual C++ naming scheme, which suggests nvcc compiled the host code in the .cu file with the Microsoft compiler, and MinGW's g++ cannot resolve those. I am not sure about this, but you can first try to explicitly link with the stdc++ library:
g++ -o runme Main.o KernelWrapper.obj -lstdc++
(If the above hack does not fix things, read below)
By the way, I just noticed the errors are in KernelWrapper.obj... I hope you are not using (much) C++ in the .cu file... nvcc is not too good with C++ stuff (well, it was not good 6 months ago, dunno how it is now).
If all you want to do is print stuff in the .cu file, try using printf. Maybe somebody else knows a better way of doing this (some googling may help).
Thank you very very much :-)
I got it working by using printf's in KernelWrapper.cu instead of cout's.
I get a few warnings, but now I'm less worried :-)
Hi, I am having the same problems as Jens above: under Windows, I would like to create a CUDA library using nvcc, and link it together with g++-made objects from cpp files. I am getting errors similar to his, and I just can't get my code to work.
Jens, if you read this, please send me the solution. I would really appreciate a little example program, because I have been looking for the solution for days.
kottalovag[AT]gmail[DOT]com
@KottaLovag: Did both the above solutions not help you (using printf instead of cout, and explicitly linking with libstdc++)?
Maybe you can share a few lines of the errors you see.
Thank you for the quick answer!
I created a minimal representation of my problem and uploaded it here: http://www.megafileupload.com/en/file/316921/cuda-problem-example-zip.html
I have Win7 x64, CUDA and MinGW installed.
I have Visual Studio as well, but I don't want to use that, because I am planning to do this with Qt later.
As you can see in the uploaded example (or below), I would like to create a static library for the CUDA part of my application. This can be done with MakeStatlib.bat, whose first call sets up the environment for the Visual C++ compiler (I read this in the nvcc manual).
Then I would like to create a runnable app, which calls the call() function that lives in cudalib.lib.
The content of the uploaded example:
files:
kernel.cu
caller.cu
kernel.h
caller.h
main.c
MakeStatlib
MakeStatlib.bat
MakeRunnable
MakeRunnable.bat
in kernel.cu:
__global__ void calculateKernel()
{
}
in caller.cu:
#include /* dim3 */
#include "kernel.h" /* calculateKernel() */
void call()
{
    dim3 dimBlock(1,1);
    dim3 dimGrid(1,1);
    calculateKernel<<< dimGrid, dimBlock >>>();
}
in kernel.h:
#ifndef KERNEL_H
#define KERNEL_H
__global__ void calculateKernel();
#endif /* KERNEL_H */
in caller.h:
#ifndef CALLER_H
#define CALLER_H
void call();
#endif /* CALLER_H */
in main.c:
#include "caller.h" /* call() */
int main()
{
    call();
    return 0;
}
in MakeStatlib:
cudalib.lib: kernel.obj caller.obj
	nvcc -lib kernel.obj caller.obj -o cudalib.lib
kernel.obj: kernel.cu kernel.h
	nvcc -c kernel.cu -o kernel.obj
caller.obj: caller.cu caller.h kernel.h
	nvcc -c caller.cu -o caller.obj
in MakeStatlib.bat:
call "c:\Program Files (x86)\Microsoft Visual Studio 9.0\Common7\Tools\vsvars32.bat"
mingw32-make -f MakeStatlib
in MakeRunnable:
app: main.o cudalib.lib
	gcc main.o cudalib.lib -o app
main.o: main.c caller.h
	gcc -c main.c -o main.o
cudalib.lib: kernel.cu caller.h caller.cu
	MakeStatlib.bat
in MakeRunnable.bat:
mingw32-make -f MakeRunnable
PAUSE
I forgot to mention the problem:
I can successfully create cudalib.lib, but unfortunately cannot link it together with main.o. The linker says:
gcc main.o cudalib.lib -o app
main.o:main.c:(.text+0xc): undefined reference to 'call'
collect2: ld returned 1 exit status
The other thing to mention is that the blog hides "<"cuda">" when including it, thinking of it as a tag :)
Sorry I can't actually try out your code, since I don't have the setup necessary to run this. Also, my experience in this matter is limited to the linux platform. But since this seems to be a regular linking issue, have you tried using something like this to link:
gcc main.o -static -L. -lcudalib.lib -o app
You can also add the '-v' argument to gcc to see a verbose dump that may reveal the problem.
@Kashyap:
Thank you for the advice, I tried it.
for
-L. cudalib
-L. -lcudalib.lib
-L. -llibcudalib
-L. -llibcudalib.lib
-lcudalib
-lcudalib.lib
-llibcudalib
-llibcudalib.lib
it says it does not find
for
-L. cudalib.lib
-L. -lcudalib
it says undefined reference to 'call'
verbose:
D:\-=OWN=-\-=PROJECTS=-\CUDA\Tutoring\examples\cuda_example_01_(1krncu_1callercu
_1cudah_1mainc)_libformat>mingw32-make -f MakeRunnable
gcc -v main.o -L. -lcudalib -o app
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=c:/mingw/bin/../libexec/gcc/mingw32/4.5.2/lto-wrapper.exe
Target: mingw32
Configured with: ../gcc-4.5.2/configure --enable-languages=c,c++,ada,fortran,obj
c,obj-c++ --disable-sjlj-exceptions --with-dwarf2 --enable-shared --enable-libgo
mp --disable-win32-registry --enable-libstdcxx-debug --enable-version-specific-r
untime-libs --disable-werror --build=mingw32 --prefix=/mingw
Thread model: win32
gcc version 4.5.2 (GCC)
COMPILER_PATH=c:/mingw/bin/../libexec/gcc/mingw32/4.5.2/;c:/mingw/bin/../libexec
/gcc/;c:/mingw/bin/../lib/gcc/mingw32/4.5.2/../../../../mingw32/bin/
LIBRARY_PATH=c:/mingw/bin/../lib/gcc/mingw32/4.5.2/;c:/mingw/bin/../lib/gcc/;c:/
mingw/bin/../lib/gcc/mingw32/4.5.2/../../../../mingw32/lib/;c:/mingw/bin/../lib/
gcc/mingw32/4.5.2/../../../
COLLECT_GCC_OPTIONS='-v' '-L.' '-o' 'app.exe' '-mtune=i386' '-march=i386'
c:/mingw/bin/../libexec/gcc/mingw32/4.5.2/collect2.exe -Bdynamic -o app.exe c:/
mingw/bin/../lib/gcc/mingw32/4.5.2/../../../crt2.o c:/mingw/bin/../lib/gcc/mingw
32/4.5.2/crtbegin.o -L. -Lc:/mingw/bin/../lib/gcc/mingw32/4.5.2 -Lc:/mingw/bin/.
./lib/gcc -Lc:/mingw/bin/../lib/gcc/mingw32/4.5.2/../../../../mingw32/lib -Lc:/m
ingw/bin/../lib/gcc/mingw32/4.5.2/../../.. main.o -lcudalib -lmingw32 -lgcc_eh -
lgcc -lmoldname -lmingwex -lmsvcrt -luser32 -lkernel32 -ladvapi32 -lshell32 -lmi
ngw32 -lgcc_eh -lgcc -lmoldname -lmingwex -lmsvcrt c:/mingw/bin/../lib/gcc/mingw
32/4.5.2/crtend.o
main.o:main.c:(.text+0xc): undefined reference to `call'
collect2: ld returned 1 exit status
mingw32-make: *** [app] Error 1
D:\-=OWN=-\-=PROJECTS=-\CUDA\Tutoring\examples\cuda_example_01_(1krncu_1callercu
_1cudah_1mainc)_libformat>PAUSE
Press any key to continue . . .
Hmm, that did not go very well... You should try -static. And there is no name conflict, right? I.e. no other lib named cudalib. You could also try a static lib tutorial that does not involve CUDA first, just to make sure nvcc is not doing anything funny here, and also to confirm the gcc params. Other than that, I'm not sure how to fix this.
I went through these tutorials and descriptions and created examples for myself without CUDA, and everything worked well:
http://www.adp-gmbh.ch/cpp/gcc/create_lib.html
http://www.codeproject.com/KB/cpp/libraries.aspx
http://www.mingw.org/wiki/Specify_the_libraries_for_the_linker_to_use
I also checked, there is no conflict. :(
Anyway, thank you for trying to help me.
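One possible cause that the thread does not try (so treat it as a guess): main.c is compiled as C and looks for a plain symbol named call, while nvcc compiles caller.cu as C++ and emits a mangled name. A header that works for both languages usually wraps its declarations in extern "C" when seen by a C++ compiler. The sketch below shows that pattern in plain C++. The two int parameters and the return value are assumptions added only so the stand-in does something checkable; the original call() takes no arguments and returns nothing.

```cpp
#include <cassert>

// ---- a variant of caller.h that can be included from both C and C++ ----
#ifndef CALLER_H
#define CALLER_H
#ifdef __cplusplus
extern "C" {                          // C++ compilers export the plain, unmangled name
#endif
int call(int blocks, int threads);    // hypothetical signature, for illustration only
#ifdef __cplusplus
}
#endif
#endif /* CALLER_H */

// ---- stand-in for caller.cu (which would include caller.h and launch the kernel) ----
int call(int blocks, int threads)
{
    assert(blocks > 0 && threads > 0);
    return blocks * threads;          // total threads a real launch would start
}
```

Even with matching names, objects built by the Visual C++ toolchain and by MinGW are not guaranteed to link together, so this may remove the undefined reference to call without making the whole setup work.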
Hi friend, I created mainServer.cpp_o and PIRMatrixMul.cu_o, but when I execute the makefile of the server I get an error of "/usr/bin/ld:cutil.h cannot find". Please tell me the solution. I am going to paste my makefile. And when I try to run my makefile (not considering the two object files) I get an error saying CUT_INIT_DEVICE requires two arguments.
The makefile is:
Creating server Makefile
Executing 'make' for the server...
/usr/bin/ld: 1: /bin: Permission denied
/usr/bin/ld: 2: Makefile: not found
/usr/bin/ld: 3: Makefile: not found
/usr/bin/ld: 4: Syntax error: "(" unexpected
collect2: ld returned 2 exit status
make: *** [PIRServer] Error 1
how to resolve this query?
Hi friends,
I have files called mulshare.cu and mulshare.h.
In mulshare.cu I have used the arc4random function. While compiling with the nvcc compiler I am getting this error:
error: identifier "arc4random" is undefined.
Here is the relevant part of the code:
int main(int argc, char* argv[])
{
    Matrix A,B,C;
    int a1,a2,b1,b2;
    a1 = atoi(argv[1]); /* height of A */
    a2 = atoi(argv[2]); /* width of A */
    b1 = a2; /* height of B */
    b2 = atoi(argv[3]); /* Width of B */
    A.height = a1;
    A.width = a2;
    A.elements = (float*)malloc(A.width * A.height * sizeof(float));
    B.height = b1;
    B.width = b2;
    B.elements = (float*)malloc(B.width * B.height * sizeof(float));
    C.height = A.height;
    C.width = B.width;
    C.elements = (float*)malloc(C.width * C.height * sizeof(float));
    for(int i = 0; i < A.height; i++)
        for(int j = 0; j < A.width; j++)
            A.elements[i*A.width + j] = (arc4random() % 3);
    for(int i = 0; i < B.height; i++)
        for(int j = 0; j < B.width; j++)
            B.elements[i*B.width + j] = (arc4random() % 2);
    MatMul(A,B,C);
    for(int i=0; i < min(10,A.height);i++)
    {
        for(int j=0; j < min(10,A.width);j++)
            printf("%f",A.elements[i*A.width + j]);
        printf("\n");
    }
    printf("\n");
    for(int i=0; i < min(10,B.height);i++)
    {
        for(int j=0; j < min(10,B.width);j++)
            printf("%f",B.elements[i*B.width + j]);
        printf("\n");
    }
    printf("\n");
    for(int i=0; i < min(10,C.height);i++)
    {
        for(int j=0; j < min(10,C.width);j++)
            printf("%f",C.elements[i*C.width + j]);
        printf("\n");
    }
    printf("\n");
}
I am trying to compile with nvcc as well as gcc, but the error still comes up.
I am using: nvcc mulshare.cu
Please help me get out of this.
arc4random is not a standard C/C++ function; it is a BSD extension, so it is not available on every platform.
Are you including any external libraries?
The error you are getting points to not including the correct headers.
You will need to include whatever headers are needed for arc4random, and tell nvcc where these headers and libs are (-I and -L).
If arc4random works separately in a regular C compiler (gcc), you could try to split your files into one cpp file and one cu file.
Put all external function calls like arc4random in the cpp file, and compile the cpp file directly using gcc (don't link, use gcc -c).
Then compile the cu file using nvcc and link the output binaries as described in my original post.
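A rough sketch of the cpp half of that split. The file name host_random.cpp and the function fill_random are hypothetical, and std::rand is used as a portable stand-in so the example builds anywhere; swap in arc4random() on a platform that provides it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// ---- host_random.cpp (hypothetical file, compiled with g++ -c) ----
// Every call to the external random function stays in this file, so nvcc never
// has to find its header. extern "C" keeps the symbol name simple for the linker.
extern "C" void fill_random(float* elements, std::size_t count, unsigned modulus)
{
    assert(elements != nullptr || count == 0);
    assert(modulus > 0);
    for (std::size_t i = 0; i < count; ++i) {
        // std::rand is a stand-in here (assumption); the original code used arc4random()
        elements[i] = static_cast<float>(static_cast<unsigned>(std::rand()) % modulus);
    }
}
```

The cu file would then only declare extern "C" void fill_random(float*, size_t, unsigned); and call it, for example fill_random(A.elements, A.width * A.height, 3), before launching the kernel.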