OGRE Memory Allocation Policies

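This article describes the memory allocators in the OGRE rendering engine. The preface lists the OGRE version, build environment, and tools used; the rest covers the design and implementation of OGRE's memory allocation and allocation policies.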

1. Preface

The version of the OGRE source code analysed is shown in the figure below.

[Figure: 2013-12-30-fig1]

Operating system: Windows 7 x64

Build tool: CMake

Compiler: Visual Studio 2008

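The build generates an OgreBuildSettings.h file that records user-selected options. For example, the macros below build OGRE with the D3D9 and OpenGL render systems and the BSP, OCTREE, PCZ, PFX, and CG plugins. The build also uses little-endian byte order, although that macro is not included in the excerpt.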

#define OGRE_BUILD_RENDERSYSTEM_D3D9
/* #undef OGRE_BUILD_RENDERSYSTEM_D3D11 */
#define OGRE_BUILD_RENDERSYSTEM_GL
/* #undef OGRE_BUILD_RENDERSYSTEM_GLES */
/* #undef OGRE_BUILD_RENDERSYSTEM_GLES2 */
#define OGRE_BUILD_PLUGIN_BSP
#define OGRE_BUILD_PLUGIN_OCTREE
#define OGRE_BUILD_PLUGIN_PCZ
#define OGRE_BUILD_PLUGIN_PFX
#define OGRE_BUILD_PLUGIN_CG
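
As a rough illustration of how such settings are consumed, the sketch below tests a few of the macros with #ifdef. The function enabledRenderSystems is hypothetical and not part of OGRE, and the macro definitions are copied from the excerpt above only so that the sketch compiles on its own.

```cpp
#include <cassert>
#include <string>

// Copied from the settings above so this sketch is self-contained;
// a real build would get them from OgreBuildSettings.h.
#define OGRE_BUILD_RENDERSYSTEM_D3D9
/* #undef OGRE_BUILD_RENDERSYSTEM_D3D11 */
#define OGRE_BUILD_RENDERSYSTEM_GL

// Hypothetical helper: lists the render systems enabled at compile time.
inline std::string enabledRenderSystems()
{
    std::string names;
#ifdef OGRE_BUILD_RENDERSYSTEM_D3D9
    names += "D3D9 ";
#endif
#ifdef OGRE_BUILD_RENDERSYSTEM_D3D11
    names += "D3D11 ";   // skipped: the macro is left undefined above
#endif
#ifdef OGRE_BUILD_RENDERSYSTEM_GL
    names += "GL ";
#endif
    return names;
}
```

A macro that appears only inside a /* #undef ... */ comment is simply not defined, so the corresponding branch is left out of the build.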

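OGRE's configuration, including the choice of allocator, whether multithreading support is enabled, and various platform- and compiler-specific macros, is generally kept in the following header files.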

OgreBuildSettings.h
OgreConfig.h
OgrePlatform.h
OgreStdHeaders.h
OgreStableHeaders.h
OgrePrerequisites.h

2. OGRE Memory Allocation

2.1 Main files

The files in the OGRE rendering engine related to memory management are:

OgreAlignedAllocator.h
OgreAlignedAllocator.cpp
OgreMemoryAllocatedObject.h
OgreMemoryAllocatorConfig.h
OgreMemoryNedAlloc.h        // nedmalloc allocator, without memory pools
OgreMemoryNedAlloc.cpp
OgreMemoryNedPooling.h      // nedmalloc allocator with memory pools; OGRE's default allocator
OgreMemoryNedPooling.cpp
OgreMemoryStdAlloc.h        // the system's built-in allocator
OgreMemorySTLAllocator.h
OgreMemoryTracker.h         // tracks allocations and deallocations, e.g. to detect memory leaks
OgreMemoryTracker.cpp

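2.2 OGRE's three memory allocation methods and their implementation

OGRE defines four allocator options, of which three are implemented. STD is the system's built-in allocator; NED and NEDPOOLING both use the nedmalloc allocator (for an introduction to memory allocators, see the article 《内存分配器浅谈》, "A Brief Discussion of Memory Allocators"); USER stands for a user-defined allocator and is left empty in the source.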

// define the memory allocator configuration to use
#define OGRE_MEMORY_ALLOCATOR_STD 1
#define OGRE_MEMORY_ALLOCATOR_NED 2
#define OGRE_MEMORY_ALLOCATOR_USER 3
#define OGRE_MEMORY_ALLOCATOR_NEDPOOLING 4

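OGRE selects the allocator through the OGRE_MEMORY_ALLOCATOR macro. Its default value is 4, which selects nedmalloc with memory pools.

Each of the three allocators must implement a class with the same interface but a different name. The basic structure is as follows: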

class _OgreExport AllocateNamePolicy
{
public:
	static inline void* allocateBytes(size_t count, 
		const char* file = 0, int line = 0, const char* func = 0);
	static inline void deallocateBytes(void* ptr);
	/// Get the maximum size of a single allocation
	static inline size_t getMaxAllocationSize();

private:
	// No instantiation
	AllocateNamePolicy ();
};
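
To make the interface concrete, here is a minimal sketch of a policy with the same three static functions. The class name CountingAllocPolicy, its liveAllocations counter, and the countingPolicyDemo function are hypothetical and not part of OGRE; the sketch also drops _OgreExport so that it compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <limits>

// Hypothetical policy, not part of OGRE: the same three static functions as
// the skeleton above, plus a counter of live allocations for illustration.
class CountingAllocPolicy
{
public:
    static void* allocateBytes(std::size_t count,
        const char* = 0, int = 0, const char* = 0)
    {
        void* ptr = std::malloc(count);
        if (ptr) ++liveAllocations();
        return ptr;
    }
    static void deallocateBytes(void* ptr)
    {
        if (ptr) --liveAllocations();
        std::free(ptr);
    }
    /// Get the maximum size of a single allocation
    static std::size_t getMaxAllocationSize()
    {
        return std::numeric_limits<std::size_t>::max();
    }
    // Number of blocks handed out and not yet returned.
    static std::size_t& liveAllocations()
    {
        static std::size_t n = 0;
        return n;
    }
private:
    CountingAllocPolicy(); // no instantiation
};

// Allocates and frees one block; returns the live count seen in between.
inline std::size_t countingPolicyDemo()
{
    void* p = CountingAllocPolicy::allocateBytes(64);
    assert(p != 0);
    std::size_t during = CountingAllocPolicy::liveAllocations();
    CountingAllocPolicy::deallocateBytes(p);
    return during;
}
```

Because every member is static and the constructor is private, the class is used purely as a compile-time parameter; nothing ever creates an instance of it.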

With the STD allocator, the policy classes are StdAllocPolicy and StdAlignedAllocPolicy, implemented in OgreAlignedAllocator.h, OgreAlignedAllocator.cpp, and OgreMemoryStdAlloc.h.

With NED, they are NedAllocPolicy and NedAlignedAllocPolicy, implemented in OgreMemoryNedAlloc.h and OgreMemoryNedAlloc.cpp.

With NEDPOOLING, they are NedPoolingPolicy and NedPoolingAlignedPolicy, implemented in OgreMemoryNedPooling.h and OgreMemoryNedPooling.cpp.

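2.3 OGRE's memory allocation policy

A memory allocation policy here refers to a design pattern that applies a custom memory allocator to the allocation and release of object memory.

The following uses the system's built-in allocator to illustrate OGRE's allocation policy; in this case OGRE_MEMORY_ALLOCATOR is 1.

Two kinds of memory are allocated: unaligned and aligned. For an explanation of memory alignment, see [2] and [3], which cover it in detail.

OGRE's unaligned allocation code is shown below. The code between #if OGRE_MEMORY_TRACKER and #endif belongs to the memory tracker and does not affect allocation or release.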

class _OgreExport StdAllocPolicy
{
public:
	static inline void* allocateBytes(size_t count, 
#if OGRE_MEMORY_TRACKER
		const char* file = 0, int line = 0, const char* func = 0
#else
		const char*  = 0, int  = 0, const char* = 0
#endif
		)
	{
		void* ptr = malloc(count);
#if OGRE_MEMORY_TRACKER
		// this alloc policy doesn't do pools
		MemoryTracker::get()._recordAlloc(ptr, count, 0, file, line, func);
#endif
		return ptr;
	}
	static inline void deallocateBytes(void* ptr)
	{
#if OGRE_MEMORY_TRACKER
		MemoryTracker::get()._recordDealloc(ptr);
#endif
		free(ptr);
	}

	/// Get the maximum size of a single allocation
	static inline size_t getMaxAllocationSize()
	{
		return std::numeric_limits<size_t>::max();
	}
private:
	// no instantiation
	StdAllocPolicy()
	{ }
};

The aligned allocation code is as follows:

template <size_t Alignment = 0>
class StdAlignedAllocPolicy
{
public:
	// compile-time check alignment is available.
	typedef int IsValidAlignment
		[Alignment <= 128 && ((Alignment & (Alignment-1)) == 0) ? +1 : -1];

	static inline void* allocateBytes(size_t count, 
#if OGRE_MEMORY_TRACKER
		const char* file = 0, int line = 0, const char* func = 0
#else
		const char*  = 0, int  = 0, const char* = 0
#endif
		)
	{
		void* ptr = Alignment ? AlignedMemory::allocate(count, Alignment)
			: AlignedMemory::allocate(count);
#if OGRE_MEMORY_TRACKER
		// this alloc policy doesn't do pools
		MemoryTracker::get()._recordAlloc(ptr, count, 0, file, line, func);
#endif
		return ptr;
	}

	static inline void deallocateBytes(void* ptr)
	{
#if OGRE_MEMORY_TRACKER
		MemoryTracker::get()._recordDealloc(ptr);
#endif
		AlignedMemory::deallocate(ptr);
	}

	/// Get the maximum size of a single allocation
	static inline size_t getMaxAllocationSize()
	{
		return std::numeric_limits<size_t>::max();
	}
private:
	// No instantiation
	StdAlignedAllocPolicy()
	{ }
};
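
The IsValidAlignment typedef in the listing above is a compile-time check: an array type of size -1 is ill-formed, so an unsupported alignment stops compilation. The sketch below isolates that idea. The names AlignmentIsValid, CheckedAlignment, and alignmentCheckDemo are hypothetical, and the condition is split into its own template only so that it can also be inspected as a value.

```cpp
#include <cassert>
#include <cstddef>

// A power of two has exactly one bit set, so (a & (a - 1)) is zero for it.
// Zero also passes the test; the policy above uses 0 for "default alignment".
template <std::size_t Alignment>
struct AlignmentIsValid
{
    static const bool value =
        Alignment <= 128 && ((Alignment & (Alignment - 1)) == 0);
};

// Same trick as IsValidAlignment: the array size is +1 when the condition
// holds and -1 (a compile error) when it does not.
template <std::size_t Alignment>
struct CheckedAlignment
{
    typedef int IsValidAlignment[AlignmentIsValid<Alignment>::value ? +1 : -1];
};

inline bool alignmentCheckDemo()
{
    // Naming the typedef instantiates the check; both lines compile.
    assert(sizeof(CheckedAlignment<0>::IsValidAlignment) == sizeof(int));
    assert(sizeof(CheckedAlignment<16>::IsValidAlignment) == sizeof(int));
    // sizeof(CheckedAlignment<24>::IsValidAlignment) would be rejected
    // by the compiler, because 24 is not a power of two.
    return AlignmentIsValid<16>::value
        && !AlignmentIsValid<24>::value    // not a power of two
        && !AlignmentIsValid<256>::value;  // power of two, but above 128
}
```

With a C++11 compiler the same check is usually written with static_assert; the array typedef works on older compilers such as the one used in this article.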

Aligned allocation on top of the system allocator is implemented by the AlignedMemory class; see OgreAlignedAllocator.h and OgreAlignedAllocator.cpp. The main code is as follows:

/** Class to provide aligned memory allocate functionality.
@remarks
    All SIMD processing are friendly with aligned memory, and some SIMD routines
    are designed for working with aligned memory only. If the data are intended to
    use SIMD processing, it's need to be aligned for better performance boost.
    In additional, most time cache boundary aligned data also lead to better
    performance even if didn't used SIMD processing. So this class provides a couple
    of functions for allocate aligned memory.
@par
    Anyways, in general, you don't need to use this class directly, Ogre internally
    will take care with most SIMD and cache friendly optimisation if possible.
@par
    This isn't a "one-step" optimisation, there are a lot of underlying work to
    achieve performance boost. If you didn't know what are you doing or what there
    are going, just ignore this class.
@note
    This class intended to use by advanced user only.
*/
class _OgreExport AlignedMemory
{
public:
    /** Allocate memory with given alignment.
        @param
            size The size of memory need to allocate.
        @param
            alignment The alignment of result pointer, must be power of two
            and in range [1, 128].
        @return
            The allocated memory pointer.
        @par
            On failure, exception will be throw.
    */
    static void* allocate(size_t size, size_t alignment);

    /** Allocate memory with default platform dependent alignment.
        @remarks
            The default alignment depend on target machine, this function
            guarantee aligned memory according with SIMD processing and
            cache boundary friendly.
        @param
            size The size of memory need to allocate.
        @return
            The allocated memory pointer.
        @par
            On failure, exception will be throw.
    */
    static void* allocate(size_t size);

    /** Deallocate memory that allocated by this class.
        @param
            p Pointer to the memory allocated by this class or <b>NULL</b> pointer.
        @par
            On <b>NULL</b> pointer, nothing happen.
    */
    static void deallocate(void* p);
};
void* AlignedMemory::allocate(size_t size, size_t alignment)
{
    assert(0 < alignment && alignment <= 128 && Bitwise::isPO2(alignment));

    unsigned char* p = new unsigned char[size + alignment];
    size_t offset = alignment - (size_t(p) & (alignment-1));

    unsigned char* result = p + offset;
    result[-1] = (unsigned char)offset;

    return result;
}
//---------------------------------------------------------------------
void* AlignedMemory::allocate(size_t size)
{
    return allocate(size, OGRE_SIMD_ALIGNMENT);
}
//---------------------------------------------------------------------
void AlignedMemory::deallocate(void* p)
{
    if (p)
    {
        unsigned char* mem = (unsigned char*)p;
        mem = mem - mem[-1];
        delete [] mem;
    }
}

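Looking more closely at aligned allocation: void* AlignedMemory::allocate(size_t size, size_t alignment) allocates the memory, and alignment specifies the alignment in bytes. It must be a power of two in the range [1, 128].

Take 4-byte alignment and a 100-byte request as an example, that is, size = 100 and alignment = 4. The code works as follows:

    assert(0 < alignment && alignment <= 128 && Bitwise::isPO2(alignment));
        // alignment = 4 satisfies the condition, so execution continues
    unsigned char* p = new unsigned char[size + alignment];
        // 104 bytes are requested; assume the returned address is p = 0x006b8f33
    size_t offset = alignment - (size_t(p) & (alignment-1));
        // p & 3 = 3, so offset = 4 - 3 = 1; whatever p is, offset lies in [1, 4]
    unsigned char* result = p + offset;
        // result = 0x006b8f34, which is 4-byte aligned
    result[-1] = (unsigned char)offset;
        // offset is stored in the byte at 0x006b8f33, because deallocation
        // needs it to recover the original address 0x006b8f33
    return result;
        // return the aligned pointer; the algorithm ends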

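Once the aligned allocation is understood, the deallocation code is easy to follow: it reads the offset stored in the byte just before the pointer, subtracts it to recover the original address, and passes that address to delete [].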
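
The allocate and deallocate pair can be tried outside OGRE with the standalone sketch below. The function names alignedAllocate, alignedDeallocate, and alignedAllocDemo are illustrative, and the power-of-two test is written inline instead of calling Bitwise::isPO2; otherwise the logic follows the listing above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Over-allocates by `alignment` bytes, then moves forward to the next
// aligned address and stores the distance moved in the byte before it.
inline void* alignedAllocate(std::size_t size, std::size_t alignment)
{
    assert(0 < alignment && alignment <= 128
        && (alignment & (alignment - 1)) == 0);
    unsigned char* raw = new unsigned char[size + alignment];
    // offset is in [1, alignment], so result[-1] always lies inside the block.
    std::size_t offset = alignment
        - (reinterpret_cast<std::uintptr_t>(raw) & (alignment - 1));
    unsigned char* result = raw + offset;
    result[-1] = static_cast<unsigned char>(offset);
    return result;
}

inline void alignedDeallocate(void* p)
{
    if (p)
    {
        unsigned char* mem = static_cast<unsigned char*>(p);
        mem -= mem[-1];   // step back to the address new[] returned
        delete [] mem;
    }
}

// Returns true if every power-of-two alignment up to 128 yields an aligned
// pointer whose stored offset is in [1, alignment].
inline bool alignedAllocDemo()
{
    for (std::size_t alignment = 1; alignment <= 128; alignment *= 2)
    {
        unsigned char* p =
            static_cast<unsigned char*>(alignedAllocate(100, alignment));
        bool aligned =
            (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
        bool offsetOk = p[-1] >= 1 && p[-1] <= alignment;
        alignedDeallocate(p);
        if (!aligned || !offsetOk) return false;
    }
    return true;
}
```

The upper limit of 128 matters here: the offset is stored in a single unsigned char, so it has to fit in one byte.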

OGRE uses a template class to overload new, new[], delete, and delete[], as shown in the code below. Note in particular placement new and placement delete; for these special forms of new and delete in C++, see [1].

/** Superclass for all objects that wish to use custom memory allocators
	when their new / delete operators are called.
	Requires a template parameter identifying the memory allocator policy 
	to use (e.g. see StdAllocPolicy). 
*/
template <class Alloc>
class _OgreExport AllocatedObject
{
public:
	explicit AllocatedObject()
	{ }

	~AllocatedObject()
	{ }

	/// operator new, with debug line info
	void* operator new(size_t sz, const char* file, int line, const char* func)
	{
		return Alloc::allocateBytes(sz, file, line, func);
	}

	void* operator new(size_t sz)
	{
		return Alloc::allocateBytes(sz);
	}

	/// placement operator new
	void* operator new(size_t sz, void* ptr)
	{
		(void) sz;
		return ptr;
	}

	/// array operator new, with debug line info
	void* operator new[] ( size_t sz, const char* file, int line, const char* func )
	{
		return Alloc::allocateBytes(sz, file, line, func);
	}

	void* operator new[] ( size_t sz )
	{
		return Alloc::allocateBytes(sz);
	}

	void operator delete( void* ptr )
	{
		Alloc::deallocateBytes(ptr);
	}

	// Corresponding operator for placement delete (second param same as the first)
	void operator delete( void* ptr, void* )
	{
		Alloc::deallocateBytes(ptr);
	}

	// only called if there is an exception in corresponding 'new'
	void operator delete( void* ptr, const char* , int , const char*  )
	{
		Alloc::deallocateBytes(ptr);
	}

	void operator delete[] ( void* ptr )
	{
		Alloc::deallocateBytes(ptr);
	}

	void operator delete[] ( void* ptr, const char* , int , const char*  )
	{
		Alloc::deallocateBytes(ptr);
	}
};
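
One reason the class declares its own placement operator new is that any operator new declared in a class hides the global forms for that class and its derived classes. The sketch below shows this with a reduced stand-in; PooledBase, Particle, and placementNewDemo are hypothetical names, and malloc and free are called directly instead of going through an Alloc policy. Unlike the listing above, the placement delete in this sketch does nothing, on the assumption that the caller owns the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Reduced stand-in for AllocatedObject<Alloc>.
struct PooledBase
{
    static void* operator new(std::size_t sz)
    {
        void* p = std::malloc(sz);
        if (!p) throw std::bad_alloc();
        return p;
    }
    static void operator delete(void* ptr) { std::free(ptr); }

    // Declaring any operator new in a class hides the global forms, so the
    // placement form must be redeclared for `new (buffer) T` to compile.
    static void* operator new(std::size_t, void* ptr) { return ptr; }
    // Matching placement delete: the storage belongs to the caller, so this
    // sketch leaves it alone.
    static void operator delete(void*, void*) { }
};

struct Particle : PooledBase
{
    int id;
    explicit Particle(int i) : id(i) { }
};

inline int placementNewDemo()
{
    // Caller-owned storage; the union member forces a generous alignment.
    union { char bytes[sizeof(Particle)]; long double align; } storage;
    Particle* p = new (storage.bytes) Particle(42); // constructs in place
    int id = p->id;
    p->~Particle();   // destroy explicitly; `delete p` would be wrong here
    return id;
}
```

Removing the two-argument operator new from PooledBase makes the new (storage.bytes) line fail to compile, which is the situation the redeclaration avoids.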

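That covers the individual pieces of OGRE's memory allocation; the next step is to connect them. For example, when STD is the allocator, creating and destroying an object with new and delete calls the overloaded operator new and operator delete, as in the following snippet.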

	AllocatedObject<StdAllocPolicy>* p1 = new AllocatedObject<StdAllocPolicy>;
	AllocatedObject< StdAlignedAllocPolicy<4> >* p2 = new AllocatedObject< StdAlignedAllocPolicy<4> >;
	delete p1;
	delete p2;

A class can use one of the allocators defined above by deriving from AllocatedObject, as the following example shows.

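#include <cstdio>   // printf
// Also requires the OGRE headers that declare AllocatedObject and StdAllocPolicy
// (OgreMemoryAllocatedObject.h and OgreMemoryStdAlloc.h in the file list above).
using namespace Ogre;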
namespace Ogre
{
	class Node:public AllocatedObject<StdAllocPolicy>
	{
	public:
		Node()
		{
			printf("Node Constructor\n");
		}
		~Node()
		{
			printf("Node Destructor\n");
		}
	};
}

int main( )
{
	Node* p = new Node;
	delete p;
	return 0;
}

The output is:

Node Constructor

Node Destructor

The execution path is as follows. Note that delete p runs the destructor first and only then calls operator delete to release the memory.

void* AllocatedObject::operator new(size_t sz);
void* StdAllocPolicy::allocateBytes(size_t count, const char* file = 0, int line = 0, const char* func = 0);
Node::Node();
Node::~Node();
void AllocatedObject::operator delete( void* ptr );
void StdAllocPolicy::deallocateBytes(void* ptr);
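
The call order can be checked without OGRE using the standalone sketch below. TracingPolicy, TracedObject, TracedNode, callLog, and traceNewDelete are hypothetical stand-ins for StdAllocPolicy, AllocatedObject, and Node; each step appends its name to a string. In this sketch the destructor runs before operator delete, as the C++ rules for a delete-expression require.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>

// Records each step so the order can be inspected afterwards.
inline std::string& callLog() { static std::string log; return log; }

// Hypothetical policy standing in for StdAllocPolicy.
struct TracingPolicy
{
    static void* allocateBytes(std::size_t count)
    { callLog() += "allocateBytes;"; return std::malloc(count); }
    static void deallocateBytes(void* ptr)
    { callLog() += "deallocateBytes;"; std::free(ptr); }
};

// Reduced stand-in for AllocatedObject<Alloc>.
template <class Alloc>
struct TracedObject
{
    static void* operator new(std::size_t sz)
    {
        callLog() += "operator new;";
        void* p = Alloc::allocateBytes(sz);
        if (!p) throw std::bad_alloc();
        return p;
    }
    static void operator delete(void* ptr)
    { callLog() += "operator delete;"; Alloc::deallocateBytes(ptr); }
};

struct TracedNode : TracedObject<TracingPolicy>
{
    TracedNode()  { callLog() += "ctor;"; }
    ~TracedNode() { callLog() += "dtor;"; }
};

// Creates and destroys one node and returns the recorded sequence.
inline std::string traceNewDelete()
{
    callLog().clear();
    TracedNode* p = new TracedNode;
    delete p;
    return callLog();
}
```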

The allocation policy is summarised in the figure below, where Alloc is any class that implements the following three functions.

	static inline void* allocateBytes(size_t count, 
		const char* file = 0, int line = 0, const char* func = 0);
	static inline void deallocateBytes(void* ptr);
	/// Get the maximum size of a single allocation
	static inline size_t getMaxAllocationSize();

[Figure: 2013-12-30 23-25-09]

2.4 Settings related to OGRE memory allocation

OGRE defines an enumeration that specifies the purpose of an allocation:

enum MemoryCategory
{
	/// General purpose
	MEMCATEGORY_GENERAL = 0,
	/// Geometry held in main memory
	MEMCATEGORY_GEOMETRY = 1, 
	/// Animation data like tracks, bone matrices
	MEMCATEGORY_ANIMATION = 2, 
	/// Nodes, control data
	MEMCATEGORY_SCENE_CONTROL = 3,
	/// Scene object instances
	MEMCATEGORY_SCENE_OBJECTS = 4,
	/// Other resources
	MEMCATEGORY_RESOURCE = 5,
	/// Scripting
	MEMCATEGORY_SCRIPTING = 6,
	/// Rendersystem structures
	MEMCATEGORY_RENDERSYS = 7,
	// sentinel value, do not use 
	MEMCATEGORY_COUNT = 8
};

As shown in section 2.2, the three allocation methods use differently named implementation classes. Here they are unified under a single class name.

#if OGRE_MEMORY_ALLOCATOR == OGRE_MEMORY_ALLOCATOR_NEDPOOLING
namespace Ogre
{
	template <MemoryCategory Cat> class CategorisedAllocPolicy : public NedPoolingPolicy{};
	template <MemoryCategory Cat, size_t align = 0> class CategorisedAlignAllocPolicy : public NedPoolingAlignedPolicy<align>{};
}
#elif OGRE_MEMORY_ALLOCATOR == OGRE_MEMORY_ALLOCATOR_NED
namespace Ogre
{
	template <MemoryCategory Cat> class CategorisedAllocPolicy : public NedAllocPolicy{};
	template <MemoryCategory Cat, size_t align = 0> class CategorisedAlignAllocPolicy : public NedAlignedAllocPolicy<align>{};
}
#elif OGRE_MEMORY_ALLOCATOR == OGRE_MEMORY_ALLOCATOR_STD
namespace Ogre
{
	template <MemoryCategory Cat> class CategorisedAllocPolicy : public StdAllocPolicy{};
	template <MemoryCategory Cat, size_t align = 0> class CategorisedAlignAllocPolicy : public StdAlignedAllocPolicy<align>{};
}
#else
// your allocators here?
#endif
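
In the listing above the Cat parameter is not used inside the class bodies, but each category value still produces a distinct type. The sketch below illustrates what that makes possible. The per-category counter is a hypothetical addition, not something the listing contains, and all names (DemoCategory, DemoBasePolicy, DemoCategorisedPolicy, categoryDemo) are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

enum DemoCategory { DEMO_GENERAL = 0, DEMO_GEOMETRY = 1 }; // hypothetical

// Stands in for whichever concrete policy the build selected.
struct DemoBasePolicy
{
    static void* allocateBytes(std::size_t n) { return std::malloc(n); }
    static void deallocateBytes(void* p) { std::free(p); }
};

// One distinct type per category, so each gets its own static counter.
template <DemoCategory Cat>
struct DemoCategorisedPolicy : DemoBasePolicy
{
    static std::size_t& allocationCount()
    {
        static std::size_t n = 0;
        return n;
    }
    static void* allocateBytes(std::size_t n)
    {
        ++allocationCount();
        return DemoBasePolicy::allocateBytes(n);
    }
};

// Allocates twice in one category and once in another, then encodes the
// two counters as a two-digit number.
inline std::size_t categoryDemo()
{
    typedef DemoCategorisedPolicy<DEMO_GENERAL> GeneralPolicy;
    typedef DemoCategorisedPolicy<DEMO_GEOMETRY> GeometryPolicy;
    GeneralPolicy::allocationCount() = 0;
    GeometryPolicy::allocationCount() = 0;

    void* a = GeneralPolicy::allocateBytes(16);
    void* b = GeneralPolicy::allocateBytes(16);
    void* c = GeometryPolicy::allocateBytes(16);
    GeneralPolicy::deallocateBytes(a);
    GeneralPolicy::deallocateBytes(b);
    GeometryPolicy::deallocateBytes(c);

    return GeneralPolicy::allocationCount() * 10
         + GeometryPolicy::allocationCount();
}
```

The two counters stay independent even though both policies forward to the same underlying allocator.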

In my view, the typedefs below exist mainly to make the code more readable.

// Useful shortcuts
typedef CategorisedAllocPolicy<Ogre::MEMCATEGORY_GENERAL> GeneralAllocPolicy;
typedef CategorisedAllocPolicy<Ogre::MEMCATEGORY_GEOMETRY> GeometryAllocPolicy;
typedef CategorisedAllocPolicy<Ogre::MEMCATEGORY_ANIMATION> AnimationAllocPolicy;
typedef CategorisedAllocPolicy<Ogre::MEMCATEGORY_SCENE_CONTROL> SceneCtlAllocPolicy;
typedef CategorisedAllocPolicy<Ogre::MEMCATEGORY_SCENE_OBJECTS> SceneObjAllocPolicy;
typedef CategorisedAllocPolicy<Ogre::MEMCATEGORY_RESOURCE> ResourceAllocPolicy;
typedef CategorisedAllocPolicy<Ogre::MEMCATEGORY_SCRIPTING> ScriptingAllocPolicy;
typedef CategorisedAllocPolicy<Ogre::MEMCATEGORY_RENDERSYS> RenderSysAllocPolicy;

// Now define all the base classes for each allocation
typedef AllocatedObject<GeneralAllocPolicy> GeneralAllocatedObject;
typedef AllocatedObject<GeometryAllocPolicy> GeometryAllocatedObject;
typedef AllocatedObject<AnimationAllocPolicy> AnimationAllocatedObject;
typedef AllocatedObject<SceneCtlAllocPolicy> SceneCtlAllocatedObject;
typedef AllocatedObject<SceneObjAllocPolicy> SceneObjAllocatedObject;
typedef AllocatedObject<ResourceAllocPolicy> ResourceAllocatedObject;
typedef AllocatedObject<ScriptingAllocPolicy> ScriptingAllocatedObject;
typedef AllocatedObject<RenderSysAllocPolicy> RenderSysAllocatedObject;

typedef ScriptingAllocatedObject	AbstractNodeAlloc;
typedef AnimationAllocatedObject	AnimableAlloc;
typedef AnimationAllocatedObject	AnimationAlloc;
typedef GeneralAllocatedObject		ArchiveAlloc;
.....

There are also a number of macros for allocating and releasing memory; see OgreMemoryAllocatorConfig.h for details.
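
As a sketch of how such a macro can feed the "debug line info" overloads of operator new shown earlier, the example below defines a hypothetical DEMO_NEW macro that expands to a new-expression carrying the source file and line. DEMO_NEW, TrackedBase, Widget, lastAllocation, and macroDemo are invented for this illustration; for the macros OGRE actually provides, consult OgreMemoryAllocatorConfig.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Remembers where the most recent tracked allocation came from.
struct LastAllocation { const char* file; int line; };
inline LastAllocation& lastAllocation()
{
    static LastAllocation info = { 0, 0 };
    return info;
}

struct TrackedBase
{
    static void* operator new(std::size_t sz, const char* file, int line)
    {
        void* p = std::malloc(sz);
        if (!p) throw std::bad_alloc();
        lastAllocation().file = file;
        lastAllocation().line = line;
        return p;
    }
    // Called only if a constructor throws after the overload above succeeded.
    static void operator delete(void* p, const char*, int) { std::free(p); }
    // Used by ordinary delete-expressions.
    static void operator delete(void* p) { std::free(p); }
};

// Hypothetical macro: each use expands to a new-expression with its location.
#define DEMO_NEW new (__FILE__, __LINE__)

struct Widget : TrackedBase { };

// Returns true if the recorded line matches the line of the DEMO_NEW use.
inline bool macroDemo()
{
    Widget* w = DEMO_NEW Widget; const int expectedLine = __LINE__;
    delete w;
    return lastAllocation().file != 0
        && lastAllocation().line == expectedLine;
}
```

Call sites stay as short as a plain new, while the allocator still learns where each allocation was requested.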

3. References

[1] http://en.wikipedia.org/wiki/Placement_syntax

[2] http://www.cppblog.com/snailcong/archive/2009/03/16/76705.html

[3] https://www.ibm.com/developerworks/library/pa-dalign/

 

 

 
