### Chapter 30 Plugging in the Grid

> And if we are able thus to attack an inferior force with a superior one, our opponents will be in dire straits.
>
> Sun Tzu

In Chapter 28, Clouds and Grids, we covered the basics of grid computing. In this chapter, we cover in more detail the pros and cons of grid computing as well as where such computing infrastructure could fit in different companies. Whether you are a Web 2.0, Fortune 500, or enterprise software company, it is likely that you have a need for grid computing in your scalability toolset. This chapter provides a framework for further understanding a grid computing infrastructure as well as some ideas about where in your organization to deploy it. Grid computing offers on-demand scaling of computing cycles for computationally intense applications or programs. By understanding the pros and cons of grid computing and seeing some ideas for how this type of technology might be used, you should be well armed to apply this knowledge in your scalability efforts.

As a refresher, we defined grid computing in Chapter 28 as the term used to describe the use of two or more computers processing individual parts of an overall task. Tasks that are best structured for grid computing are ones that are computationally intensive and divisible, meaning able to be broken into smaller tasks. Software is used to orchestrate the separation of tasks, monitor the computation of these tasks, and then aggregate the completed tasks. This is parallel processing on a network-distributed basis instead of inside a single machine. Before grid computing, mainframes were the only way to achieve this scale of parallel processing.
Today's grids are often composed of thousands of nodes spread across networks such as the Internet.

Why would we consider grid computing as a principle, architecture, or aid to an organization's scalability? The reason is that grid computing allows an application to use significant computational resources in order to process data or solve problems faster. Dividing processing is a core component of scaling; think of the x-, y-, and z-axis splits in the AKF Scale Cube. Depending on how the separation of processing is done or viewed, the splitting of the application for grid computing might take the shape of one or more of the axes.

#### Pros and Cons of Grids

Grid environments are ideal for applications that need computationally intensive environments and for applications that can be divided into elements that can be executed simultaneously. With that as a basis, we are going to discuss the benefits and drawbacks of grid computing environments. The pros and cons will matter differently to different organizations. If your application can be divided easily, either by luck or design, you might not care that the only way to achieve great benefits is with applications that can be divided. However, if you have a monolithic application, this drawback may be so significant as to completely rule out the use of a grid environment. As we discuss each of the pros and cons, keep in mind that each will matter more or less depending on your technology organization.
##### Pros of Grids

The pros of grid computing models include high computational rates, shared infrastructure, utilization of unused capacity, and cost. Each of these is explained in more detail in the following sections. The ability to scale computation cycles up quickly as necessary for processing is directly applicable to scaling an application, service, or program. In terms of scalability, it is important to grow the computational capacity as needed, but it is equally important to do this efficiently and cost effectively.

High Computational Rates. The first benefit that we want to discuss is a basic premise of grid computing: high computational rates. The grid computing infrastructure is designed for applications that need computationally intensive environments. The combination of multiple hosts with software for dividing tasks and data allows for the simultaneous execution of multiple tasks. The amount of parallelization is limited by the hosts available, the amount of division possible within the application, and, in extreme cases, the network linking everything together. We covered Amdahl's law in Chapter 28, but it is worth repeating because it defines the upper bound that the application itself places on this benefit. The law was developed by Gene Amdahl in 1967 and states that the portion of a program that cannot be parallelized will limit the total speedup from parallelization. This means that nonsequential parts of a program will benefit from the parallelization, but the rest of the program will not.
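To make the upper bound concrete, here is a minimal sketch of the usual statement of Amdahl's law, speedup = 1 / ((1 - p) + p / n), where p is the parallelizable fraction of the work and n is the number of hosts. The function names and the 95 percent figure are illustrative assumptions, not values taken from the chapter.

```python
def amdahl_speedup(parallel_fraction: float, hosts: int) -> float:
    """Theoretical speedup when `parallel_fraction` of the work is spread over `hosts`."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if hosts < 1:
        raise ValueError("hosts must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / hosts)


def speedup_limit(parallel_fraction: float) -> float:
    """Speedup ceiling as the number of hosts grows without bound."""
    serial_fraction = 1.0 - parallel_fraction
    return float("inf") if serial_fraction == 0.0 else 1.0 / serial_fraction


if __name__ == "__main__":
    # Hypothetical program in which 95% of the work can run in parallel.
    for hosts in (1, 10, 100, 1000):
        print(hosts, round(amdahl_speedup(0.95, hosts), 2))
    print("limit", round(speedup_limit(0.95), 2))
```

Under these assumptions, going from 100 to 1,000 hosts moves the speedup only from roughly 17 to roughly 20, because the 5 percent that must run sequentially caps the gain at about 20 times no matter how large the grid becomes.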
Shared Infrastructure. The second benefit of grid computing is the use of shared infrastructure. Most applications that utilize grid computing do so daily, weekly, or on some other periodic schedule. Outside of the periods in which the computing infrastructure is used for grid computing purposes, it can be utilized by other applications or technology organizations. We will discuss the limitation of sharing the infrastructure simultaneously in the "Cons of Grids" section. This benefit is focused on sharing the infrastructure sequentially. Whether the grid is private or public, its host computers can be utilized almost continuously around the clock. Of course, this requires the proper scheduling of jobs within the overall grid system so that as one application completes its processing the next one can begin. It also requires either applications that are flexible in the times that they run or applications that can be stopped in the middle of a job and delayed until there is free capacity later in the day. If an application must run every day at 1 AM, the job before it must either complete prior to this time or be designed to stop in the middle of its processing and restart later without losing valuable computations. For anyone familiar with job scheduling on mainframes, this should sound a little familiar because, as we mentioned earlier, the mainframe was the only way to achieve such intensive parallel processing before grid computing.
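One way to picture a job that can stop in the middle of its processing and restart later is a simple checkpoint file. The sketch below is only an illustration under stated assumptions: `run_job`, the JSON checkpoint layout, and the sum-of-squares workload are all hypothetical, and a real grid scheduler would handle this quite differently.

```python
import json
from pathlib import Path


def run_job(items, checkpoint_file, max_items_this_window):
    """Process at most `max_items_this_window` items, resuming from any saved checkpoint.

    Returns (partial_result, finished). The workload (a sum of squares) is a
    stand-in for an expensive computation.
    """
    checkpoint_file = Path(checkpoint_file)
    if checkpoint_file.exists():
        state = json.loads(checkpoint_file.read_text())
    else:
        state = {"next_index": 0, "partial_sum": 0}

    stop_at = min(len(items), state["next_index"] + max_items_this_window)
    for index in range(state["next_index"], stop_at):
        state["partial_sum"] += items[index] ** 2
        state["next_index"] = index + 1
        # Write to a temporary file first so an interruption cannot leave a
        # half-written checkpoint behind.
        temp_file = checkpoint_file.with_suffix(".tmp")
        temp_file.write_text(json.dumps(state))
        temp_file.replace(checkpoint_file)

    return state["partial_sum"], state["next_index"] == len(items)
```

Calling `run_job` once with a small window and again later with a larger one picks up where the first call stopped, so work finished before the 1 AM job took over the grid is not repeated.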
Utilization of Unused Capacity. The third benefit that we see in some grid computing implementations is the utilization of unused capacity. Grid computing implementations vary: some are wholly dedicated to grid computing all day, whereas others are used as other types of computers during the day and connected to the grid at night when no one is using them. For grids that utilize surplus capacity, this approach is known as CPU scavenging. One of the most well-known CPU scavenging programs has been SETI@home, which utilizes unused CPU cycles on volunteers' computers in a search for extraterrestrial intelligence in radio telescope data. There are obvious drawbacks to utilizing spare capacity, including the unpredictability of the number of hosts and of the speed or capacity of each host. When dealing with large corporate computer networks or standardized systems that are idle during the evening, these drawbacks are minimized.

Cost. A fourth benefit that can come from grid computing is in terms of cost. One can realize a benefit of scaling efficiently in a grid as it takes advantage of the distributed nature of applications.
This can be thought of in terms of scaling the y-axis, as discussed in Chapter 23, Splitting Applications for Scale, and shown in Figure 23.1. As one service or particular computation has more demand placed on it, instead of scaling the entire application or suite of services along an x-axis (horizontal duplication), you can be much more specific and scale only the service or computation that requires the growth. This allows you to spend much more efficiently, paying only for the capacity that is necessary. The other cost advantage can come from scavenging spare cycles on desktops or other servers, as described in the previous paragraph referencing the SETI@home program.

##### Pros of Grid Computing

We have identified four major benefits of grid computing. These are listed in no particular order and are not all inclusive. There are many more benefits, but these are representative of the types of benefits you could expect from including grid computing in your infrastructure.

* High computation rates. With the amalgamation of multiple hosts on a network, an application can achieve very high computational rates or computational throughput.
* Shared infrastructure. Although grids are not necessarily great infrastructure components to share with other applications simultaneously, they are generally not used around the clock and can be shared by applications sequentially.
* Unused capacity. For grids that utilize unused hosts during off hours, the grid offers a great use for this untapped capacity.
Personal computers are not the only untapped capacity; testing environments are often not utilized during the late evening hours and can be integrated into a grid computing system.
* Cost. Whether the grid is scaling the specific program within your service offerings or taking advantage of scavenged capacity, these are both ways to make computations more cost-effective. This is yet another reason to look at grids as scalability solutions.

These are four of the benefits that you may see from integrating a grid computing system into your infrastructure. The amount of benefit that you see from any of these will depend on your specific application and implementation.

##### Cons of Grids

We are now going to switch from the benefits of utilizing a grid computing infrastructure and talk about the drawbacks. As with the benefits, the significance or importance that you place on each of the drawbacks is going to be directly related to the applications that you are considering for the grid. If your application was designed to be run in parallel and is not monolithic, the drawback concerning monolithic applications may be of little concern to you. However, if you have arrived at a grid computing architecture because your monolithic application has grown to where it cannot compute 24 hours' worth of data in a 24-hour time span and you must do something or else continue to fall behind, that drawback may be of grave concern to you. We will discuss three major drawbacks as we see them with grid computing: the difficulty of sharing the infrastructure simultaneously, the inability to work well with monolithic applications, and the increased complexity of utilizing these infrastructures.
Not Shared Simultaneously. The first con or drawback is that it is difficult, if not impossible, to share the grid computing infrastructure simultaneously. Certainly, some grids are large enough that they have the capacity to run many applications simultaneously, but those applications are really still running in separate grid environments, with the hosts just reallocated for a particular time period. For example, if I have a grid that consists of 100 hosts, I could run 10 applications on 10 separate hosts each. Although this does count as sharing the infrastructure, as we stated in the benefits section earlier, it is not sharing the same hosts simultaneously. Running more than one application on the same host defeats the purpose of the massive parallel computing that is gained by the grid infrastructure.

Grids are not great infrastructures to share with multiple tenants. You run on a grid to parallelize and increase the computational bandwidth for your application. Sharing or multitenancy can occur serially, one job after the other, in a grid environment where each application runs in isolation and the next job runs when it completes. This type of scheduling is common among systems that run large parallel processing infrastructures designed to be utilized in their entirety to compute large problem sets.
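The serial sharing described above amounts to a simple queue in which each job gets the whole grid in turn. The following is a small sketch of that idea only; the function name, job names, and durations are hypothetical.

```python
def serial_schedule(jobs, start_hour=0.0):
    """Return (name, start, finish) for jobs that each get the whole grid in turn.

    `jobs` is a list of (name, duration_hours) pairs in queue order.
    """
    schedule = []
    clock = start_hour
    for name, duration_hours in jobs:
        schedule.append((name, clock, clock + duration_hours))
        clock += duration_hours
    return schedule


if __name__ == "__main__":
    # Hypothetical nightly queue that starts at 1 AM.
    queue = [("etl", 3.0), ("billing", 2.0), ("build", 1.5)]
    for name, start, finish in serial_schedule(queue, start_hour=1.0):
        print(f"{name}: {start:.1f} to {finish:.1f}")
```

Because each job starts only when the previous one finishes, lengthening any duration in the list pushes back every job behind it.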
What this means for you when running an application is that you must have flexibility built into your application and system, either to start and stop processing as necessary or to run at a fixed time each period, usually daily or weekly. Because applications need the infrastructure to themselves, they are often scheduled to run during certain windows. If an application begins to exceed its window, perhaps because it has more data to process, the window must be rescheduled to accommodate this or else all other jobs in the queue will be delayed.

Monolithic Applications. The next drawback that we see with grid computing infrastructure is that it does not work well with monolithic applications. In fact, if you cannot divide the application into parts that can be run in parallel, the grid will not help processing at all. The throughput of a monolithic application cannot be helped by running on a grid. A monolithic application can instead be replicated onto many individual servers, as seen in an x-axis split, and its capacity can be increased by adding servers. As we stated in the discussion of Amdahl's law, nonsequential parts of a program will benefit from the parallelization, but the rest of the program will not. Those parts of a program that must run in order, sequentially, cannot be parallelized.

Complexity. The last major drawback that we see in grid computing is the increased complexity of the grid.
Hosting and running an application by itself is often complex enough considering the interactions that are required with users, other systems, databases, disk storage, and so on. Add to this already complex and highly volatile environment the need to run on top of a grid environment and it becomes even more complex. The grid is not just another set of hosts. Running on a grid requires a specialized operating system that, among many other things, manages which host has which job, what happens when a host dies in the middle of a job, what data the host needs to perform the task, gathering the processed results back afterward, deleting the data from the host, and aggregating the results together. This adds a lot of complexity. If you have ever debugged an application that has hundreds of instances of the same application on different servers, you can imagine the challenge of debugging one application running across hundreds of servers.

##### Cons of Grid Computing

We have identified three major drawbacks of grid computing. These are listed in no particular order and are not all inclusive. There are many more cons, but these are representative of what you should expect if you include grid computing in your infrastructure.

* Not shared simultaneously. The grid computing infrastructure is not designed to be shared simultaneously without losing some of the benefit of running on a grid in the first place. This means that jobs and applications are usually scheduled ahead of time and not run on demand.
* Monolithic app.
If your application is not able to be divided into smaller tasks, there is little to no benefit to running on a grid. To take advantage of the grid computing infrastructure, you need to be able to break the application into nonsequential tasks that can run independently.
* Complexity. Running on a grid environment adds another layer of complexity to an application stack that is probably already complex. If there is a problem, determining whether it is caused by a bug in your application code or by the environment the code is running on becomes much more difficult.

These three cons are ones that you may see from integrating a grid computing system into your infrastructure. The significance of each one will depend on your specific application and implementation.

These are the major pros and cons that we see with integrating a grid computing infrastructure into your architecture. As we discussed earlier, the significance that you give to each of these will be determined by your specific application and technology team. As a further example of this, if you have a strong operations team that has experience working with or running grid infrastructures, the increased complexity that comes along with the grid is not likely to deter you. If you have no operations team and no one on your team has had to support an application running on a grid, this drawback may give you pause.
If you are still up in the air about utilizing grid computing infrastructure, the next section is going to give you some ideas on where you may consider using a grid. As you read through these ideas, be sure to keep in mind the benefits and drawbacks covered earlier, because these should influence your decision of whether to proceed with a similar project yourself.

#### Different Uses for Grid Computing

In this section, we are going to cover some ideas and examples for using grid computing that we have either seen or discussed with clients and employers. By sharing these, we aim to give you a sampling of the possible implementations; we do not consider this list all-inclusive. There are a myriad of ways to implement and take advantage of a grid computing infrastructure. After everyone becomes familiar with grids, you and your team will surely be able to come up with an extensive list of possible projects that could benefit from this architecture, and then you simply have to weigh the pros and cons of each project to determine whether any is worth actually implementing. Grid computing is an important tool to utilize when scaling applications, whether by using a grid to scale a single program in your production environment more cost effectively or by using it to speed up a step in the product development cycle, such as compilation. Scalability is not just about the production environment, but about the processes and people that support it as well. Keep this in mind as you read these examples and consider how grid computing can aid your scalability efforts.
We have four examples that we are going to describe as potential uses for grid computing. These are running your production environment on a grid, using a grid for compilation, implementing parts of a data warehouse environment on a grid, and back office processing on a grid. We know there are many more implementations that are possible, but these should give you a breadth of examples that you can use to jumpstart your own brainstorming session.

##### Production Grid

The first example usage is, of course, to use grid computing in your production environment. This may not be possible for applications that require real-time user interactions, such as those offered by Software as a Service companies. However, for IT organizations that use very mathematically complex applications to control manufacturing processes or shipping, this might be a great fit. Many of these applications have historically resided on mainframe or midrange systems. Many technology organizations are finding it more difficult to support these larger and older machines from the perspective of both vendor support and engineering support. There are fewer engineers who know how to run and program these machines and fewer who would prefer to learn these skill sets instead of Web programming skills.
The grid computing environment offers solutions to both the machine support and the engineering support problems of older technologies. Migrating to a grid that runs on lots of commodity hardware, as opposed to one strategic piece of hardware, is a way to reduce your dependency on a single vendor for support and maintenance. Not only does this push the balance of power into your court, it is possibly a significant cost savings for your organization. At the same time, you should more easily be able to find already trained engineers or administrators who have experience running grids, or at the very least find employees who are excited about learning one of the newer technologies.

##### Build Grid

The next example is using a grid computing infrastructure for your build or compilation machines. If compiling your application takes a few minutes on your desktop, this might seem like overkill, but there are many applications that would take days to compile the entire code base on a single host or developer machine. This is when a build farm or grid environment comes in very handy. Compiling is ideally suited for grids because there are so many divisions of work that can take place, and they can all be performed nonsequentially. The later stages of the build, which include linking, become more sequential and thus are not capable of running on a grid, but the early stages are ideal for a division of labor.
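The division of labor described above can be sketched roughly as follows. This is a simulation under stated assumptions, not a real build system: `compile_unit` and `link` are hypothetical stand-ins that hash and join strings rather than invoke a compiler, and a local thread pool plays the role of the grid hosts.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def compile_unit(name, source):
    """Stand-in for compiling one source file; returns (object name, fake object code)."""
    digest = hashlib.sha256(source.encode("utf-8")).hexdigest()[:8]
    return name.replace(".c", ".o"), digest


def link(objects):
    """Stand-in for the link step, which runs once after every object exists."""
    return "+".join(f"{name}:{code}" for name, code in sorted(objects))


def build(sources, workers=4):
    """Compile every unit independently (parallel), then link (sequential)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each unit depends only on its own source, so order does not matter here.
        objects = list(pool.map(lambda item: compile_unit(*item), sources.items()))
    # Linking needs all objects, so it cannot start until the fan-out completes.
    return link(objects)
```

The point of the sketch is the shape of the work: the compile step fans out because no unit depends on another, while the link step waits for all of them, which is why adding workers helps only the first part.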
Most companies compile or build an executable version of the checked-in code each evening so that anyone who needs to test that version can have it available and be sure that the code will actually build successfully. Going days without knowing whether the checked-in code can build properly will result in hours (if not days) of work by engineers to fix the build before it can be tested by the quality assurance engineers. Not requiring a successful build every day, and waiting until the last step to get the build working, will cause delays for engineers and will likely lead engineers to hold off on checking in code until the very end, which risks losing their work and is a great way to introduce a lot of bugs into the code. By building from the source code repository every night, these problems are avoided. A great source of untapped compilation capacity at night is the testing environments. These are generally used during the day and can be tapped in the evening to help augment the build machines. This concept of CPU scavenging was discussed before, but this is a simple implementation of it that can save quite a bit of money in additional hardware cost.

For C, C++, Objective-C, or Objective-C++ builds, implementing a distributed compilation process can be as simple as running distcc, which, as its site (http://www.distcc.org) claims, is a fast and free distributed compiler.
It works by simply running the distcc daemon on all the servers in the compilation grid, placing the names of these servers in an environment variable, and then starting the build process.

##### Build Steps

There are many different types of compilers and many different processes that source code goes through to become code that can be executed by a machine. At a high level, there are compiled languages and interpreted languages. Setting aside just-in-time (JIT) compilers and bytecode interpreters, compiled languages are those in which the code written by the engineers is reduced to machine-readable code ahead of time using a compiler. Interpreted languages use an interpreter to read the code from the source file and execute it at runtime. Here are the rudimentary steps that are followed by most compilation processes and the corresponding input/output:

* In: source code

1. Preprocessing. For C-family languages, this step expands macros and include directives and strips comments; checking for syntactical correctness happens mainly during compilation.

* Out/In: source code

2. Compiling. This step converts the source code to assembly code based on the language's definitions of syntax.

* Out/In: assembly code

3. Assembling. This step converts the assembly language into machine instructions or object code.

* Out/In: object code

4. Linking. This final step combines the object code into a single executable.

* Out: executable code

A formal discussion of compiling is beyond the scope of this book, but this four-step process is a high-level overview of how source code gets turned into code that can be executed by a machine.

##### Data Warehouse Grid

The next example that we are going to cover is using a grid as part of the data warehouse infrastructure. There are many components in a data warehouse, from the primary source databases to the end reports that users view. One particular component that can make use of a grid environment is the transformation phase of the extract-transform-load (ETL) step in the data warehouse. This ETL process is how data is pulled or extracted from the primary sources, transformed into a different form (usually a denormalized star schema), and then loaded into the data warehouse. The transformation can be computationally intensive and is therefore a prime candidate for the power of grid computing.

The transformation process may be as simple as denormalizing data, or it may be as extensive as rolling up many months' worth of sales data for thousands of transactions. Very intense processing, such as monthly or even annual rollups, can often be broken into multiple pieces and divided among a host of computers. Work divided this way is very suitable for a grid environment.
As we covered in Chapter 27, Too Much Data, massive amounts of data are often the reason that jobs such as the ETL cannot be processed in the time period required by either customers or internal users. Certainly, you should consider how to limit the amount of data that you are keeping and processing, but it is possible that the data growth is the result of an exponential growth in traffic, which is what you want. A solution is to implement a grid computing infrastructure for the ETL to finish these jobs in a timely manner.

##### Back Office Grid

The last example that we want to cover is back office processing. An example of such back office processing takes place every month in most companies when they close the financial books. This is often a time of massive amounts of processing, data aggregation, and computation. It is usually done with an enterprise resource planning (ERP) system, a financial software package, a homegrown system, or some combination of these. Attempting to run off-the-shelf software on a grid computing infrastructure when the system was not designed to do so may be challenging, but it can be done. Often, very large ERP systems allow for quite a bit of customization and configuration. If you have ever been responsible for this process or waited days for it to be finished, you will agree that being able to run it on possibly hundreds of host computers and finish within hours would be a monumental improvement. There are many back office systems that are very computationally intensive; end-of-month processing is just one. Others include invoicing, supply reordering, resource planning, and quality assurance testing.
Use these as a springboard to develop your own list of potential places for improvement.

We covered four examples of grids in this section: running your production environment on a grid, using a grid for compilation, implementing parts of a data warehouse environment on a grid, and back office processing on a grid. We know that many more implementations are possible; these are only meant to provide you with some examples that you can use to come up with your own applications for grid computing. After you have done so, you can apply the pros and cons along with a weighting score. We will cover how to do this in the next section of this chapter.

#####MapReduce

We covered MapReduce in Chapter 27, but we should point out here in the chapter on grid computing that MapReduce is an implementation of distributed computing, which is another name for grid computing. In essence, MapReduce is a special-case grid computing framework used for text tokenizing and indexing.

####Decision Process

Now we will cover the process for deciding which of the ideas you brainstormed should be pursued. The overall process that we recommend is to first brainstorm the potential areas of improvement. Using the pros and cons that we outlined in this chapter, as well as any others that you think of, weight the pros and cons based on your particular application.
Score each idea against the pros and cons. Based on the final tally, decide which ideas, if any, should be pursued. We are going to provide an example as a demonstration of the steps.

Let's take our company AllScale.com. We currently have no grid computing implementations, but we have read The Art of Scalability and think it might be worth investigating whether grid computing is right for any of our applications. We decide that two projects are worth considering because they are beginning to take too long to process, are backing up other jobs, and are hindering our employees from getting their work done. The projects are the data warehouse ETL and the monthly financial closing of the books. We decide to use the three pros and three cons identified in the book, and to add one more con: the initial cost of implementing the grid infrastructure.

Now that we have completed step one, we are ready to apply weights to the pros and cons, which is step two. We will use a 1, 3, or 9 scale so that we highly differentiate the factors that we care about. The first con is that the grid cannot be shared simultaneously. We don't think this is a very big deal because we are considering implementing this as a private cloud: only our department will utilize it, and we will likely use scavenged CPU to implement it. We weight this as -1; it is negative because it is a con, and this makes the math easier when we multiply and add the scores. The next con is the inhospitable environment that grids are for monolithic applications.
We also don't care much about this con, because both alternative ideas are capable of being split easily into nonsequential tasks, so we weight it -1 as well. We care somewhat about the increased complexity because, although we do have a stellar operations team, we would like them not to handle too much extra work. We weight this -3. The last con is the cost of implementing. This is a big deal for us because we have a limited infrastructure budget this year and cannot afford to pay much for the grid. We weight this -9 because it is very important to us.

On the pros, we consider the fact that grids have high computational rates very important to us because this is the primary reason that we are interested in the technology. We are going to weight this +9. The next pro on the list is that a grid is shared infrastructure. We like that we can potentially run multiple applications, in sequence, on the grid computing infrastructure, but it is not that important, so we weight it +1. The last pro to weight is that grids can make use of unused capacity, such as with CPU scavenging. Because minimizing cost is a very important goal for us, this ability to use extra or surplus capacity is important as well, and we weight it +9. This concludes step two, the weighting of the pros and cons.
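The weights chosen in step two can be written down as data, which makes the sign convention and the 1, 3, 9 scale easy to check. The following is only a sketch: the dictionary keys and the `check_weights` helper are hypothetical names introduced for illustration, and the weight of -1 for the monolithic-application con is inferred from the step-four calculation for the ETL project.

```python
# Weights from the AllScale.com example. Cons are negative, pros are positive.
# The key names are illustrative labels, not terms defined by the book.
CON_WEIGHTS = {
    "not_simultaneously_shared": -1,
    "monolithic_applications": -1,  # inferred from the step-four calculation
    "increased_complexity": -3,
    "cost_of_implementation": -9,
}
PRO_WEIGHTS = {
    "high_computational_rates": 9,
    "shared_infrastructure": 1,
    "unused_capacity": 9,
}

ALLOWED_MAGNITUDES = {1, 3, 9}


def check_weights(cons, pros):
    """Return True if every weight is on the 1/3/9 scale with the expected sign."""
    cons_ok = all(w < 0 and -w in ALLOWED_MAGNITUDES for w in cons.values())
    pros_ok = all(w > 0 and w in ALLOWED_MAGNITUDES for w in pros.values())
    return cons_ok and pros_ok


print(check_weights(CON_WEIGHTS, PRO_WEIGHTS))  # True
```

Keeping cons negative at this stage means the later tally is a plain multiply-and-sum with no special handling for cons.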
The next step is to score each alternative idea on a scale from 0 to 5 for how strongly it demonstrates each of the pros and cons. As an example, we scored the ETL project low on "not simultaneously shared," as shown in Table 30.1, because it would potentially be the only application running on the grid at this time and so has only a minor relationship with that con. The cost is important to both projects, and because the monthly financial closing project is larger, we ranked it higher on "cost of implementation." On the pros, both projects benefit greatly from the higher computational rates, but the monthly financial closing project requires more processing, so it is ranked higher. We plan on utilizing unused capacity, such as in our QA environment, for the grid, so we ranked it high for both projects. We continued in this manner, scoring each project until the entire matrix was filled in.

Step four is to multiply the scores by the weights and then sum the products for each project. For the ETL example, we multiply the weight -1 by the score 1, add it to the product of the second weight -1 and the score 1, and continue in this manner, with the final calculation looking like this:

(1 × -1) + (1 × -1) + (1 × -3) + (3 × -9) + (3 × 9) + (1 × 1) + (4 × 9) = 32
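The tally for the ETL project can be reproduced with a few lines of Python. This is a minimal sketch rather than part of the original method: the factor labels and the `weighted_total` function are hypothetical names, and the weights and scores are the ones quoted in the ETL calculation. The scores for the monthly financial closing project come from Table 30.1 and are not repeated in the text, so they are left out here.

```python
# (factor label, weight) pairs: four cons first, then three pros.
FACTORS = [
    ("not simultaneously shared", -1),
    ("monolithic applications", -1),
    ("increased complexity", -3),
    ("cost of implementation", -9),
    ("high computational rates", 9),
    ("shared infrastructure", 1),
    ("unused capacity", 9),
]

# ETL scores on the 0-to-5 scale, in the same order as FACTORS.
ETL_SCORES = [1, 1, 1, 3, 3, 1, 4]


def weighted_total(weights, scores):
    """Multiply each score by its weight and sum the products."""
    if len(weights) != len(scores):
        raise ValueError("weights and scores must have the same length")
    return sum(w * s for w, s in zip(weights, scores))


weights = [weight for _, weight in FACTORS]
etl_total = weighted_total(weights, ETL_SCORES)
print(etl_total)  # 32
```

The same function can be called once per alternative idea, which keeps the weights fixed and makes it easy to compare totals or to rerun the tally when several people score independently.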
![](https://blog.baidu-google.com/usr/uploads/2024/06/2174783554.png)

As the final part of the process, we analyze the scores for each alternative and apply a level of common sense to them. In this example, the two ideas, ETL and monthly financial closing, scored 32 and 44, respectively. Both projects look likely to be beneficial, and we should consider both as very good candidates for moving forward. Before automatically assuming that this is our decision, we should verify against our common sense and other factors that might not have been included that it is a sound one. If something appears to be off or you want to add other factors, you should redo the matrix or have several people do the scoring independently.

![](https://blog.baidu-google.com/usr/uploads/2024/06/340026461.png)

The decision process is designed to provide you with a formal method of evaluating ideas against pros and cons. Using these types of matrices, the data can help us make decisions or, at a minimum, lay out our decision process in a logical manner.

#####Decision Steps

The following steps help you decide whether you should introduce grid computing into your infrastructure:

1. Develop alternative ideas for how to use grid computing.
2. Place weights on all the pros and cons that you can come up with.
3. Score the alternative ideas using the pros and cons.
4. Tally scores for each idea by multiplying each score by its weight and summing.

This decision matrix process will help you make data-driven decisions about which ideas should be pursued to include grid computing as part of your infrastructure.

As with cloud computing, the most likely question is not whether to implement a grid computing environment, but rather where and when you should implement it. Grid computing offers a good alternative for scaling applications that are growing quickly and need intensive computational power. Choosing the right project for the grid is critical to its success and should be done with as much thought and data as possible.

####Conclusion

In this chapter, we covered the pros and cons of grid computing, provided some real-world examples of where grid computing might fit, and covered a decision matrix to help you decide which projects make the most sense for utilizing the grid. We discussed three pros: high computational rates, shared infrastructure, and unused capacity. We also covered three cons: the environment is not shared well simultaneously, monolithic applications need not apply, and increased complexity.

We provided four real-world examples of where we see possible fits for grid computing. These examples included the production environment of some applications, the transformation part of the data warehousing ETL process, the building or compiling process for applications, and the back office processing of computationally intensive tasks. Each of these is a great example of where you may need large amounts of fast computation.
Not all similar applications can make use of the grid, but parts of many of them can be implemented on a grid. Perhaps the entire ETL process doesn't make sense to run on a grid, but the transformation process might be the key part that needs the additional computations.

The last section of this chapter was the decision matrix. We provided a framework for companies and organizations to use to think through logically which projects make the most sense for implementing a grid computing infrastructure. We outlined a four-step process that included identifying likely projects, weighting the pros and cons, scoring the projects against the pros and cons, and then summing and tallying the final scores.

Grid computing does offer some very positive benefits when it is implemented correctly and the drawbacks are minimized. This is another very important technology and concept that can be utilized in the fight to scale your organization, processes, and technology. Grids offer the ability to scale computationally intensive programs and should be considered for production as well as supporting processes. As grid computing and other technologies become available and more mainstream, technologists need to stay current on them, at least in sufficient detail to make good decisions about whether they make sense for your organization and applications.

#####Key Points

* Grid computing offers high computation rates.
* Grid computing offers shared infrastructure for applications that use it sequentially.
* Grid computing makes good use of unused capacity in the form of CPU scavenging.
* Grid computing is not good for sharing simultaneously with other applications.
* Grid computing is not good for monolithic applications.
* Grid computing does add some amount of complexity.
* Desktop computers and other unused servers are a potential source of untapped computational resources.