Data is transforming almost every industry, and in financial services banks are sitting on a goldmine of it. This is Big Data: it arrives fast, in large volumes, and in many forms. Banks own a great deal of internal data, both structured and unstructured: transactions, e-mail logs, loan portfolios, website clickstreams and risk assessments.

When internal data is combined with external data, such as information from social media, the smart grid or even weather reports, richer insights follow. Just imagine what kind of customized services an insurance bank could offer if it leveraged insights from mobile geo-location data.

Yet many banks still focus mostly on internal processes: operational efficiency, cleaning up legacy systems or responding to regulatory requirements. These measures may be essential, but banks will have to move beyond them, because their business models are being disrupted, as we speak, by competitors coming from the most unexpected directions.

These competitors are no longer players just like them, with legacy systems to maintain, rules to follow and margins to think of. They are young wolves with one great idea, the digital savvy to bring it cheaply and quickly to the consumer, and nothing to lose. They also come from other markets, with a fresh view of what money really is and how it should be handled: Google with the launch of its mobile wallet, PayPal, the Bitcoin phenomenon, and the sharing economy with its crowdfunding projects on peer-to-peer platforms.

To survive this revolution, banks have to use their newest asset, information, to understand how best to serve their customers. They can do this in two ways:

1. By using data to fight cyber crime and win back consumers’ trust: tracking events in real time for anomalies, offering behavioral authentication or ‘reading’ content in chat forums to uncover new fraud methods.
2. By using data to drive a 360-degree, customer-centric approach: recognizing customers immediately on any channel, offering them a better mobile experience or enabling a far-reaching analysis of their spending behavior and how it affects their savings.

Many banks will face big challenges before they can tap into the power of Big Data. The most important are too many data silos spread over too many departments (57% of respondents in a Capgemini study) and a shortage of people skilled in data analytics (40% of respondents in a Capgemini study). But more and more banks seem to understand that Big Data is no longer “just an option”. For instance, 90% of North American financial institutions think that successful big data initiatives will define the winners of the future.

So, the real question is: will you be one of those winners?
For those interested in practical ways to tackle these problems, Dell EMC Isilon has built-in tools that aid in recovery from a ransomware attack; however, detection and prevention are a much better alternative. Fortunately, Dell EMC partners with Superna and Varonis to offer ideal solutions.

If you’re interested in how Dell EMC Isilon and Hortonworks customers tackle other challenges around gaining value from their big data, join our upcoming webinar on “Batch + real-time analytics convergence” in late November. Register here.

One of the hottest topics for both Dell EMC and Hortonworks today is how to protect big data repositories, or data lakes, from the emerging breed of cyber-attacks. We sat down to discuss this topic and address some of the common questions we’ve faced, and would love to know your thoughts and contributions. Our thanks also to Simon Elliston Ball for his contributions to the discussion.

What are the new threats that specifically target big data environments?

The threats to big data environments fall into a few broad areas.

First, these are ‘target rich’ environments. Years of consolidating data, in order to simplify management and deliver value to data science and data analytics, make for an appealing destination for cyber attackers. These environments will be subject to many ‘advanced persistent threats’ (APTs): cyber attackers and organisations using extremely focussed and targeted techniques, ranging from spear-phishing to DDoS attacks, to gain access to or exploit your big data platforms in some way.

Second, they are powerful computational environments. So things like encryption attacks, if they are ever unleashed on big data operating environments, could potentially spread very rapidly.

Third, big data repositories are often accessible to many employees internally. In general, this is a good thing, as how else could organisations tap into the potential value of big data?
But a comprehensive framework to monitor and manage data access and security is required to protect against possible abuse or exploits.

What is it about big data environments that makes them more or less vulnerable to threats like WannaCry and other ransomware?

The good news is that WannaCry and other ransomware variants currently in the field don’t really target the operating systems on which big data platforms run. The bad news is that it’s probably just a matter of time before they do. And because these environments are very capable computational resources, these sorts of exploits could spread fast if steps aren’t taken to protect them.

What are some best practices to limit the possible spread of malware like WannaCry?

There’s a lot about the way big data platforms are architected that could potentially protect against these forms of malware, assuming the right steps are taken. Here are some suggestions:

First, conduct basic big-data hygiene. Many organisations have historically perceived big data environments, such as Apache Hadoop clusters, as internal-only resources protected by the network firewall. This may well be the case (to a point), but the nature of APTs means that if it’s there, people will find a way to reach it. If you’ve left default passwords in place, or haven’t set sensible access restrictions for employees (governed and audited by tools like Apache Ranger), get that all done! Access controls will also limit the spread of any encryptionware to the data sets accessible to each compromised user or set of credentials.

Test it! Conduct sensible security procedures to assess the potential vulnerabilities of your data stores via penetration testing and assessment. Deploy whatever countermeasures you deem necessary to limit the risk to acceptable levels.

Deploy behavioural security to protect your environment. The industry guesstimate is that there are 300 million new viruses and malware variants arriving each year.
Signature-based security will fail against ‘zero-day’ threats, so behavioural analytics is essential to monitor activity across the environment and to detect, as well as protect against, potential infections. If a system notices large-scale read/write activity typical of an encryption attack (but VERY unusual for a normal data lake), then it can shut it down dynamically by policy.

Set a sensible snapshot policy to allow for ‘rollbacks’ at the levels that meet the recovery point objectives and recovery time objectives set for key data sets. This won’t necessarily mean creating daily snapshots of a multi-petabyte data lake, but might mean that certain critical data have more frequent snapshots than less critical data. You can of course set these tiers in policy, given the right resources. This is a massive boon for the Hadoop Distributed File System (HDFS).

Do IT organisations know how to set Recovery Point Objectives (RPOs) and Recovery Time Objectives (RTOs) for big data environments?

One of the most common misunderstandings in deploying big data environments is that you can still think of RTOs and RPOs for the infrastructure as a whole. You can’t: it’s too large! You’d have to build in such a vast amount of redundancy as to make the whole thing commercially impossible. Rather, you need to set RTOs and RPOs for individual data sets or storage tiers within the environment. In this context, you need to allow sufficient slack in your resources for the right number of snapshots to be in place for key data sets to insulate you from risk. This might be anything from 30 to 50 percent unused capacity in a given storage tier, made available for snapshots, though the upper end of that range would be verging on overkill in most cases.

What about tackling the employee challenge to big data security?

Educating employees is a critical part of protecting any environment, as employees are a more likely first entry point into an organisation than anything else.
That means raising employee awareness of the dangers of spear phishing, modern malware attacks, and beyond. The standard tricks of redirecting people to websites and downloads, or sending dubious email attachments, have become much more sophisticated.

The people who attempt to hack a Hadoop cluster might start by hitting a system administrator with a ServiceNow helpdesk request… This camouflage makes the attack difficult to spot. It’s important to remember that the people coming after these resources are good: not script kiddies or mass-market ransomware opportunists, but people who are intent on causing serious damage, for either ideological or commercial reasons.

Even with training, people will remain a weak link. Given another guesstimate that the “per event” reputational and regulatory impact of a breach can cost up to two percent of market cap, and given the eventual inevitability of a breach, having good remediation policies, processes and technologies in place is key.

How do these security practices tie into wider security, risk and compliance objectives for a business?

The critical component here is the audit piece, given the need to know exactly where your data is being stored, controlled and processed, and what it’s being used for in an evolving regulatory context. This is something you apply to your use of big data, but also something big data enables you to achieve for other systems as well. The audit and exfiltration monitoring tools you build in as part of your hygiene planning around your big data are useful, for example, but these logs are no use without analytics, and without being able to cross-reference and cross-check other data resources. For example, if a piece of personal information has been accessed on one system, does it also exist on others? And should it therefore have been deleted from all?

The rise in the volumes of unstructured data represents a huge number of unknowns. As such, we are going to see a huge opportunity around digital transformation.
Organisations are going to be forced to assess how they handle data and to make some big improvements in the structure of their environments, their ability to run those analytics, their ability to pull back information in a short amount of time, and so on. Otherwise they may be exposed to regulatory enforcement or investigation for failing to embed appropriate data governance and data security within the organisation.
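The behavioural monitoring recommended above, where unusually heavy write activity is treated as a possible encryption attack, can be illustrated with a small Python sketch. This is only a minimal illustration under stated assumptions, not how any Dell EMC, Hortonworks, Superna or Varonis product works: the class name WriteBurstDetector, the rolling window, the four-sigma threshold and the sample values are all hypothetical, and a real deployment would draw on far richer signals than write volume alone.

```python
from collections import deque
from statistics import mean, stdev


class WriteBurstDetector:
    """Flag intervals whose write volume is far above the recent baseline.

    Hypothetical illustration: names, window size and threshold are assumptions.
    """

    def __init__(self, window=12, threshold_sigmas=4.0, min_history=6):
        self.history = deque(maxlen=window)        # recent samples judged normal
        self.threshold_sigmas = threshold_sigmas
        self.min_history = max(min_history, 2)     # stdev needs at least 2 samples

    def observe(self, bytes_written):
        """Return True if this sample looks like an encryption-style write burst."""
        suspicious = False
        if len(self.history) >= self.min_history:
            baseline = mean(self.history)
            spread = stdev(self.history) or 1.0    # avoid a zero-width band on flat data
            suspicious = bytes_written > baseline + self.threshold_sigmas * spread
        if not suspicious:
            # Learn only from samples judged normal, so a sustained attack
            # cannot drag the baseline upward.
            self.history.append(bytes_written)
        return suspicious


# Example: write volume per interval (arbitrary units), with one sudden burst.
detector = WriteBurstDetector()
samples = [100, 110, 95, 105, 98, 102, 101, 5000, 99]
flags = [detector.observe(s) for s in samples]
print(flags)
```

In this sketch only the burst of 5000 is flagged; a policy engine could then respond to a flagged interval, for example by suspending the offending session and triggering a snapshot of the affected data set.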
Principled Technologies (PT) report commissioned by Dell EMC, “Get flexible, feature-rich datacenter management at a lower cost,” March 2018, comparing Dell EMC FX2 with OpenManage vs. HPE Synergy with OneView.

Principled Technologies took a detailed look at “fast” in a few virtual environments using Dell EMC PowerEdge servers, Dell EMC Integrations and OpenManage suites. Testing with specific and popular environments was key, so Principled Technologies chose to test with Microsoft System Center and VMware vCenter.

These environments are deployed globally across thousands of customers. While hardware can be easily deployed and placed into the infrastructure, without a complete solution that allows IT to configure, deploy and manage within those environments, you could easily waste significant time and effort trying to make it all work, and the premise of “fast” is lost.

Getting to “fast” in your environments

Principled Technologies put Dell EMC PowerEdge with OpenManage to the test against HPE Synergy with OneView, and showed that “fast” can profoundly impact your time-to-business results: up to 97% faster and 20 fewer steps than HPE when deploying three Microsoft Windows servers with Dell EMC PowerEdge and OpenManage Essentials!

When they tested with Microsoft System Center, the results were still impressive:

Up to 87% faster and 24 fewer steps than HPE deploying multiple servers with the Dell EMC PowerEdge FX2 with OpenManage Integration for Microsoft System Center Configuration Manager
Up to 46% faster than HPE deploying a single host with Dell EMC PowerEdge FX2 with OpenManage Integration for Microsoft System Center Virtual Machine Manager

Leveraging the Dell EMC OpenManage Integration for Microsoft System Center enabled faster deployments for the business compared to a solution using HPE Synergy with OneView. Furthermore, no additional hardware such as a dedicated PXE server or deployment network was required, resulting in no lost time or additional expense.
OpenManage integrates readily with Microsoft System Center Virtual Machine Manager to quickly expand a virtual infrastructure.

Testing also revealed that configuring, deploying and maintaining Microsoft Windows servers with Dell EMC OpenManage Essentials made standard operations faster:

OpenManage Essentials saved over 22 minutes (up to 91% less time) to deploy a single Windows server
Dell EMC PowerEdge FX2 with OpenManage Essentials can update firmware for three Windows Servers up to 27% faster than the HPE Synergy solution with OneView
6 fewer steps than HPE updating firmware for three Windows Servers

Similarly, OpenManage Integration for VMware vCenter allows IT admins to manage servers directly from within vCenter. Tasks like creating profiles and deployment templates and applying firmware updates can easily be done without leaving vCenter. Doing similar tasks with HPE Synergy requires users to exit the vCenter console, which adds steps and time and raises the risk of user error. With vCenter integration, typical tasks take fewer than 10 steps and under four minutes. Savings like these could reduce time and effort for IT admins compared to using HPE OneView for VMware vCenter with HPE Synergy.

Along with Dell EMC Integrations, IT can scale management further with the Dell EMC OpenManage Enterprise suite.
The powerful suite helps you simplify, automate and unify management tasks across a data center with up to thousands of PowerEdge servers.

Further, setting up the Dell EMC PowerEdge FX with 14th-generation FC640 blades for this environment saved up to 36% on costs including deployment, a savings of more than $140,000.

With time and cost savings, IT can be “fast” and get much more done in Microsoft and VMware environments using Dell EMC FX2 servers, Integrations and OpenManage.

The result is that “fast” can really work to a company’s advantage and get you the results you want.

“Fast” time-to-business can be an overused term and hard to scope in terms of ROI, but “fast” can be achieved. The Principled Technologies results show how systems, solutions, and suites that work together can easily fit into your environment. This combination can yield great gains: IT gets better control of management, which lets it quickly turn idle resources and newly installed equipment into managed workhorses for your business.

While your IT and infrastructure may not be bullet-train fast, working with the right systems and solutions can enable your business to get “fast-er” results.

To learn more, visit DellEMC.com/Choose-PowerEdge

From deployment to management, Principled Technologies put Dell EMC PowerEdge with OpenManage to the test versus HPE Synergy with OneView

Like a bullet train, the term “fast” implies high speed and performance. The term also applies to how quickly a task can be accomplished, helping you stay ahead of your deadlines and meet your goals.

The metrics that determine whether or not something is “fast” can vary according to the end results. In business, it is all about expanding in the context of a competitive landscape. Delivering results and improving time-to-market applies to services and products. The engine delivering these metrics for the business is IT.

Getting to “fast” IT often means expanding in order to serve existing and new business needs.
For IT, speed is critical when buying new products to run business workloads. “Fast” also means managing the ever-growing infrastructure and data center. And now, IT has to get the most out of each resource for the next job, the next application, the next need, and so on. Lastly, “fast” also helps you integrate new equipment into your specific environment without downtime, hassles, or time lost to configuring and waiting.

The end result is “fast”-er time-to-business.
How can Artificial Intelligence save you money?

There is no shortage of ideas for how we might use Machine Learning to change the way our businesses operate. This is certainly the case for many of the world’s leading retailers, who are looking at how to use Machine Learning to improve their operations on many fronts, from the edge to the core to the cloud. In this article, we’ll provide an overview of how we’re able to provide a Machine Learning solution that enhances the shopping experience, in partnership with Malong and their RetailAI software stack. We are excited to share a tangible, working use case showing how Dell Technologies, in conjunction with NVIDIA’s GPU-accelerated EGX platform, provides an unmatched experience in terms of inference throughput and latency.

What is Machine Learning?

Machine Learning is just as it sounds: a machine that can learn. Simply put, it is teaching a computer to learn and to make decisions based on what it has learned. And in the retail world, it can cut costs and save time.

Figure 1 – Rethinking Loss

Deploying Machine Learning

Deploying Machine Learning at the edge isn’t as easy as training a model and testing it with data from the real world, also known as a “test set”. There is still a process that needs to be followed, much as with modern software stacks, and it looks and feels like Continuous Integration and Continuous Delivery (CI/CD). NVIDIA’s EGX platform simplifies this process by providing a set of tools, utilities, and packaging spanning model training, deployment, and monitoring, while at the same time providing unmatched performance within the industry for Deep Learning workloads. The familiar challenge of configuring and deploying the many versions of hardware drivers, along with versions of NVIDIA CUDA and cuDNN, has been solved by NVIDIA through the EGX platform [Fig 1], from hardware drivers on through to Kubernetes-specific plugins.
The EGX platform pushes this one step further and also provides vertical-specific solutions for areas such as healthcare and smart cities or, in this case, retail, specifically using the NVIDIA DeepStream SDK. DeepStream streamlines the tools, utilities, and processes required to deploy a use case such as this. Figure 2 provides a quick visual overview of the architecture of the ML pipeline.

Figure 2

Dell Technologies is the number one technology infrastructure provider, with an end-to-end portfolio in networking, compute, and storage, and there is no denying the need for all of these components in a production Machine Learning scenario. The throughput of an end-to-end Machine Learning pipeline depends on the throughput of all constituent components of the system. Dell and NVIDIA continue to showcase their performance as outlined below.

Figure 3

In this use case we will be utilizing the Dell PowerEdge R7425 [Fig 3], which is our latest 2-socket, 2U rack server. It was designed specifically with an eye on complex workloads that require different properties of compute, such as scalable memory, CPU, and networking options. This platform also allows for flexible storage in the myriad environments in which it’s deployed, and anyone who has spent any amount of time in production environments understands how variable these environments are. This is at the heart of why we designed and engineered such a flexible platform to begin with.

The modularity and scalability of this platform enable scaling up and down according to the requirements of the environment. Within this platform we’re also leveraging NVIDIA T4 [Fig 4] accelerators, which give us access to Tensor Core technology and its multi-precision capabilities to speed up inference on our platform.

Figure 4

Malong AI

When we leverage everything outlined above, we are able to realize the impact that these technologies have on how we own and operate our businesses today.
In our partnership with Malong AI we are able to utilize the Dell Technologies and NVIDIA platforms described above to accelerate the identification of fraudulent scanning of items on Self Check Out (SCO) lanes in a retail environment. There are many ways in which a person can attempt to perform scan fraud with respect to an SCO system. Some of the most common involve mistakenly (or intentionally) failing to scan the barcode on an item. Others are more nefarious: a person performs what is called “ticket switching”, replacing the barcode of the item they’re scanning with that of something less expensive.

Malong’s technology comes into play by using Deep Neural Network based computer vision methods to classify the items that are scanned and to identify whether or not the correct SKU is registered within the SCO system. It is important to note, as well, that Malong’s model is one of the best performing for this task in the world, winning the 2017 WebVision competition by a wide margin (a 50% relative error margin). This model runs on top of the Dell EMC PowerEdge R7425 shown in Figure 3, which also contains the NVIDIA T4 GPU shown in Figure 4. The Malong AI software stack is built on top of NVIDIA Metropolis, which is part of the Smart City solutions encompassed by the EGX platform. Combined, these solutions from Dell and NVIDIA allow an end-to-end Machine Learning pipeline to be run physically on-site in a given retail outlet. This is the quintessential edge deployment, and it provides immense value to those retailers willing to re-think the way they approach loss prevention.

In Figure 5, we can see the performance numbers of this platform for this use case.

Figure 5

By combining the Turing architecture of the NVIDIA T4 GPU with some TensorRT optimization, we’re able to provide a 480%+ increase in throughput of the pipeline.
This time savings quickly translates to less revenue lost from products being mis-scanned, whether innocently or intentionally, which in turn translates to real savings for retailers who face these loss-prevention challenges day in and day out.

Dell Technologies and NVIDIA, along with their large partner ecosystems, are the perfect fit to help with your digital transformation, using Machine Learning to help your retail company save time and money at the edge within your loss-prevention strategies and investments.

H. Zhao, O. Gallo, I. Frosio and J. Kautz, “Loss Functions for Image Restoration with Neural Networks,” IEEE Transactions on Computational Imaging, vol. 3, no. 1, pp. 47-57, March 2017.
L. J. Buturovic and L. T. Citkusev, “Back propagation and forward propagation,” [Proceedings 1992] IJCNN International Joint Conference on Neural Networks, Baltimore, MD, 1992, vol. 4, pp. 486-491.
S. Guo, W. Huang, H. Zhang, C. Zhuang, D. Dong, M. R. Scott and D. Huang, “CurriculumNet: Weakly Supervised Learning from Large-Scale Web Images,” European Conference on Computer Vision (ECCV), September 2018.
B. Patel, M. Scott and H. Wei, “Retail Analytics with Malong RetailAI® on DELL EMC PowerEdge servers,” October 2019.
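To make the self-checkout check described in the Malong AI section more concrete, here is a minimal Python sketch of the final decision step: comparing the SKU read from the barcode with the SKU a vision model believes it saw. It is not Malong’s RetailAI API or method. The ScanEvent fields, the review_scan function and the 0.9 confidence cut-off are hypothetical names and values chosen for illustration, and the vision model itself is assumed to exist upstream.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScanEvent:
    """One item passing the self-checkout camera (hypothetical structure)."""
    scanned_sku: Optional[str]   # SKU decoded from the barcode, None if nothing was scanned
    predicted_sku: str           # top SKU from the vision model (assumed upstream output)
    confidence: float            # model confidence in predicted_sku, 0.0 to 1.0


def review_scan(event: ScanEvent, min_confidence: float = 0.9) -> str:
    """Classify a scan as 'ok', 'mismatch', 'missed_scan' or 'uncertain'."""
    if event.confidence < min_confidence:
        return "uncertain"       # model is not sure enough to raise an alert
    if event.scanned_sku is None:
        return "missed_scan"     # item seen by the camera but never scanned
    if event.predicted_sku == event.scanned_sku:
        return "ok"
    return "mismatch"            # possible ticket switching


events = [
    ScanEvent("SKU-STEAK", "SKU-STEAK", 0.97),
    ScanEvent("SKU-BANANA", "SKU-STEAK", 0.95),
    ScanEvent(None, "SKU-WINE", 0.93),
    ScanEvent("SKU-APPLE", "SKU-PEAR", 0.40),
]
results = [review_scan(e) for e in events]
print(results)
```

Returning “uncertain” for low-confidence predictions is a deliberate choice in this sketch: it keeps the system from accusing shoppers on weak evidence, leaving those cases to an attendant or to no action at all.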
CAPE CANAVERAL, Fla. (AP) — NASA and others are marking the 35th anniversary of the Challenger launch disaster. Ceremonies were held at Kennedy Space Center and elsewhere Thursday to honor the seven killed shortly after liftoff on Jan. 28, 1986. The pandemic kept this year’s remembrance more muted than usual. About 100 people gathered at Kennedy’s Space Mirror Memorial for the late morning ceremony, held almost exactly the same time as the accident. The widow of the Challenger commander observed the anniversary from her home in Tennessee. She says the presence of teacher Christa McAuliffe on the flight added to the crew’s legacy.
NAIROBI, Kenya (AP) — Burundi has sentenced 34 people to life in prison after accusing them of a coup attempt against the former president, but all are living in exile. Many of those sentenced were informed only this week of the Supreme Court’s ruling made last year and seen by The Associated Press. The accused include a former army chief, journalists and human rights activists. A lawyer for one of them dismissed the ruling as “a political decision” and accused Burundi’s judiciary of being under the ruling party’s sway. Burundi saw a violent crackdown on protests in 2015 when then-President Pierre Nkurunziza decided to pursue a third term.