-
Notifications
You must be signed in to change notification settings - Fork 508
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
2 issues here #277
Open
eric1hello
wants to merge
224
commits into
gpgpu-sim:master
Choose a base branch
from
accel-sim:dev
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
2 issues here #277
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Fix for bugs in lazy write handling
added MSHR_HIT
change address type into ull
do not truncate 32 MSB bits of the memory address
bug fix was_writeback_sent()
Fix cache hash function and renaming
shared mem bank conflicts
Fixed an outdated comment line
Update gpgpusim.config
Adding GitHub CI
rename ci tests
Adding a non-zero return on error
* migrate_cmake: add package dependency checking * migrate_cmake: port setup_environment to CMake * migrate_cmake: break dependency checking and env export gen to different .cmake files * migrate_cmake: use CUDAToolkit_FOUND to test for CUDA compiler * migrate_cmake: use CUDAToolkit_FOUND to test for CUDA compiler * migrate_cmake: use CUDAToolkit_FOUND to test for CUDA compiler * migrate_cmake: properly parse for cuda version number * migrate_cmake: set highest CUDA supported to be 11.10.x * migrate_cmake: specify top level CMake file * migrate_cmake: add libcuda cmake file * migrate_cmake: use global compiler options and definitions * migrate_cmake: add cmake file to src * migrate_cmake: add cmake files for cuda-sim folder * migrate_cmake: add cmake files to gpgpu-sim folder * migrate_cmake: add cmake files for intersim * migrate_cmake: add short test using cmake * migrate_cmake: bump CXX standard requirement to 17 * Add cmake files for accelwattch * migrate_cmake: remove use of GLOB to grab source files * migrate_cmake: comment out the write protection on generated instructions.h * migrate_cmake: create sym folder and add newline to generated setup file * migrate_cmake: fix some path issues * migrate_cmake: let cmake thinks flex and bison generate CXX files * migrate_cmake: fix not linking pthread properly * migrate_cmake: remove debug message * migrate_cmake: add empty libopencl cmake file * migrate_cmake: install phase and runtime version detect * Added install phase to install the shared object and add symlinks * Changes with CUDA toolkit will be detected and triggered a rebuild * GPGPU-Sim detailed version string will be updated on each build * Typo fix and fix correct bin dir * Replace gcc -> g++ in intersim * ignore setup * check CMAKE_BUILD_TYPE * set DCMAKE_BUILD_TYPE --------- Co-authored-by: JRPAN <[email protected]>
CMAKE_BUILD_TYPE should be inside ${}
LDGSTS/LDGDEPBAR was introduced #62, but it's increment part was deleted by mistake. So add it. In some applications, ldgsts may not exist between ldgdepbar. In such cases, add exception handling logic to insert an empty vector. Reported-by: Okkyun Woo <[email protected]> Signed-off-by: Wonhyuk Yang <[email protected]>
* remove implicit casting, cleanup unused bank_warp_shift parameter * update cu init function prototype * remove m_bank_warp_shift from function call
* add automated clang formatter * Automated clang-format * use /bin/bash and add print * use default checkout ref * Format only after tests are success * Run CI on merge group --------- Co-authored-by: barnes88 <[email protected]> Co-authored-by: JRPAN <[email protected]>
* run formatter only on PR * remove unused & unintilized variables * fix signed & unsigned comparison warning * enable merge queue * resolve conflict * in formatter, checkout the forked repo, not the base repo in PR * Try to use jenkins for formatter * Automated Format --------- Co-authored-by: purdue-jenkins <[email protected]>
* Temp commit for Justin and Cassie to sync on code changes for adding per-stream status. * Resolved compile errors. * Removed redundant parameter * Passed cuda_stream_id from accelsim to gpgpusim * Cleaned up unused changes * Changed vector to map, having operator problems. * StreamID defaults to zero * Implemented streams to inc_stats and so on * Fixed TOTAL_ACCESS counts * Implemented GLOBAL_TIMER. * Fixed m_shader->get_kernel SEGFAULT issue in shader.cc. * Use warp_init to track streamID instead of issue_warp * Removed temp debug print * Modified cache_stats to only print data from latest finished stream Added optional arg to cache_stats::print_stats, cache_stats::print_fail_stats and their upstream functions. When streamID is specified, print stats from that stream. When not specified, print all stats. NOTE: current implementation depending on streamid never equals -1 * Removed default arg values of streamID * modified constructor of mem_fetch to pass in streamID * changed get_streamid to get_streamID * Added TODO to gpgpusim_entrypoint.cc and power_stat.cc * Only collect power stats when enabled * print last finished stream in PTX mode using last_streamID * take out additional printf * Add a field to baseline cache to indicate cache level * save gpu object in cache * Print stream ID only once per kernel * rm test print * use -1 for default stream id * cleanup debug prints * remove GLOABL_TIMER * Automated clang-format * Should be correct to print everything in power model * addressing concerns & errors * Automated clang-format * add m_stats_pw in operator+ * Automated Format --------- Co-authored-by: Justin Qiao <[email protected]> Co-authored-by: Justin Qiao <[email protected]> Co-authored-by: Tim Rogers <[email protected]> Co-authored-by: JRPan <[email protected]> Co-authored-by: purdue-jenkins <[email protected]>
* we have gcc-11 now. Check version for more than 2 digits. * version detection as well - And support c++ 11 by default
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
I found 2 issues here:
---- shader.cc
I think the pI1 shall be pI2.
if ((pI1->oprnd_type == INT_OP) || (pI1->oprnd_type == UN_OP)) { //these counters get added up in mcPat to compute scheduler power
m_stats->m_num_INTdecoded_insn[m_sid]++;
---- I tried to enable cooperative_groups in bellow cuda, but seem it doesn't work , something issue with PTX, do you know the reasons?
device int atomicAggInc(int *ptr) {
auto g = cg::coalesced_threads();
int prev;
if (g.thread_rank() == 0)
prev = atomicAdd(ptr,g.size());
}
global void vectorAdd(float *A, const float *B, float *C , int numElements) {
int i = blockDim.x * blockIdx.x + threadIdx.x;
//if (i < numElements) {
// C[i] = A[i] + B[i];
//}
if ( i%10 == 0){
int rankIdx = atomicAggInc(&count);
printf ("blockIdx = %d, threadIdx = %d, rank = %d \n",blockIdx.x ,threadIdx.x,rankIdx);
}
}