about summary refs log tree commit diff
path: root/nixpkgs/pkgs/development/compilers/llvm/3.8/D17533-1.patch
blob: 79ca953d6e5b0ce56e2448ee13763f5f97480280 (plain) (blame)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
commit eb92f5a745014532b83abfba04602fce87ca8393
Author: Chuang-Yu Cheng <cycheng@multicorewareinc.com>
Date:   Fri Apr 8 12:04:32 2016 +0000

    CXX_FAST_TLS calling convention: performance improvement for PPC64
    
    This is the same change on PPC64 as r255821 on AArch64. I have even borrowed
    his commit message.
    
    The access function has a short entry and a short exit, the initialization
    block is only run the first time. To improve the performance, we want to
    have a short frame at the entry and exit.
    
    We explicitly handle most of the CSRs via copies. Only the CSRs that are not
    handled via copies will be in CSR_SaveList.
    
    Frame lowering and prologue/epilogue insertion will generate a short frame
    in the entry and exit according to CSR_SaveList. The majority of the CSRs will
    be handled by register allcoator. Register allocator will try to spill and
    reload them in the initialization block.
    
    We add CSRsViaCopy, it will be explicitly handled during lowering.
    
    1> we first set FunctionLoweringInfo->SplitCSR if conditions are met (the target
       supports it for the given machine function and the function has only return
       exits). We also call TLI->initializeSplitCSR to perform initialization.
    2> we call TLI->insertCopiesSplitCSR to insert copies from CSRsViaCopy to
       virtual registers at beginning of the entry block and copies from virtual
       registers to CSRsViaCopy at beginning of the exit blocks.
    3> we also need to make sure the explicit copies will not be eliminated.
    
    Author: Tom Jablin (tjablin)
    Reviewers: hfinkel kbarton cycheng
    
    http://reviews.llvm.org/D17533
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265781 91177308-0d34-0410-b5e6-96231b3b80d8

diff --git a/lib/CodeGen/TargetFrameLoweringImpl.cpp b/lib/CodeGen/TargetFrameLoweringImpl.cpp
index 679ade1..0a0e079 100644
--- a/lib/CodeGen/TargetFrameLoweringImpl.cpp
+++ b/lib/CodeGen/TargetFrameLoweringImpl.cpp
@@ -63,12 +63,15 @@ void TargetFrameLowering::determineCalleeSaves(MachineFunction &MF,
   const TargetRegisterInfo &TRI = *MF.getSubtarget().getRegisterInfo();
   const MCPhysReg *CSRegs = TRI.getCalleeSavedRegs(&MF);
 
+  // Resize before the early returns. Some backends expect that
+  // SavedRegs.size() == TRI.getNumRegs() after this call even if there are no
+  // saved registers.
+  SavedRegs.resize(TRI.getNumRegs());
+
   // Early exit if there are no callee saved registers.
   if (!CSRegs || CSRegs[0] == 0)
     return;
 
-  SavedRegs.resize(TRI.getNumRegs());
-
   // In Naked functions we aren't going to save any registers.
   if (MF.getFunction()->hasFnAttribute(Attribute::Naked))
     return;